XGBoost

This page lists the XGBoost classes and functions (using the scikit-learn-like API) that Bodo supports natively inside JIT functions.
Installing XGBoost

Bodo requires XGBoost built from source with MPI support, and the XGBoost version must be <= 1.5.1. Refer to the XGBoost documentation on building from source for details on build requirements. Then build and install XGBoost in your Bodo environment as follows:
git clone --recursive https://github.com/dmlc/xgboost --branch v1.5.1
cd xgboost
mkdir build
cd build
cmake -DRABIT_BUILD_MPI=ON ..  # enable MPI support in the Rabit communication layer
make -j4
cd ../python-package
python setup.py install
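If the build succeeded, the package should import cleanly in the same environment (a quick sanity check; the exact version string depends on the tag you checked out):

>>> import xgboost
>>> print(xgboost.__version__)
1.5.1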
xgboost.XGBClassifier

This class provides an implementation of the scikit-learn API for XGBoost classification with distributed, large-scale learning.
Methods
xgboost.XGBClassifier.fit

xgboost.XGBClassifier.fit(X, y, sample_weight=None, base_margin=None, eval_set=None, eval_metric=None, early_stopping_rounds=None, verbose=True, xgb_model=None, sample_weight_eval_set=None, feature_weights=None, callbacks=None)

Supported Arguments:
- X: NumPy array or Pandas DataFrame
- y: NumPy array or Pandas DataFrame
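Since both NumPy arrays and Pandas DataFrames are accepted, training can also start from a DataFrame inside the JIT function. The following is a minimal sketch; the column names and random data are illustrative, not part of the API:

>>> import bodo
>>> import pandas as pd
>>> import numpy as np
>>> import xgboost as xgb
>>> @bodo.jit
... def fit_from_df(df):
...     X = df[["a", "b"]]  # feature columns (illustrative names)
...     y = df["label"]
...     clf = xgb.XGBClassifier(tree_method="hist", random_state=0)
...     clf.fit(X, y)
...     print(clf.predict(X))
...
>>> df = pd.DataFrame({"a": np.random.rand(10), "b": np.random.rand(10), "label": np.random.randint(0, 2, 10)})
>>> fit_from_df(df)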
xgboost.XGBClassifier.predict

xgboost.XGBClassifier.predict(X, output_margin=False, ntree_limit=None, validate_features=True, base_margin=None)

Supported Arguments:
- X: NumPy array or Pandas DataFrame
xgboost.XGBClassifier.predict_proba

xgboost.XGBClassifier.predict_proba(X, ntree_limit=None, validate_features=True, base_margin=None)

Supported Arguments:
- X: NumPy array or Pandas DataFrame
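predict_proba returns one row per sample and one column per class. A minimal sketch of calling it inside a JIT function (the random training data is purely illustrative):

>>> import bodo
>>> import numpy as np
>>> import xgboost as xgb
>>> @bodo.jit
... def proba_example():
...     X = np.random.rand(20, 4)
...     y = np.random.randint(0, 2, 20)
...     clf = xgb.XGBClassifier(tree_method="hist", random_state=0)
...     clf.fit(X, y)
...     # probabilities for the first two samples: shape (2, n_classes)
...     print(clf.predict_proba(X[:2]))
...
>>> proba_example()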
Attributes
xgboost.XGBClassifier.feature_importances_
Example Usage
>>> import bodo
>>> import xgboost as xgb
>>> import numpy as np
>>> @bodo.jit
... def test_xgbc():
...     X = np.random.rand(5, 10)
...     y = np.random.randint(0, 2, 5)
...     clf = xgb.XGBClassifier(
...         booster="gbtree",
...         random_state=0,
...         tree_method="hist",
...     )
...     clf.fit(X, y)
...     # predict input must have the same number of features (10) the model was trained on
...     print(clf.predict(np.array([[1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]])))
...     print(clf.feature_importances_)
...
>>> test_xgbc()
[1]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
xgboost.XGBRegressor

This class provides an implementation of the scikit-learn API for XGBoost regression with distributed, large-scale learning.
Methods
xgboost.XGBRegressor.fit

xgboost.XGBRegressor.fit(X, y, sample_weight=None, base_margin=None, eval_set=None, eval_metric=None, early_stopping_rounds=None, verbose=True, xgb_model=None, sample_weight_eval_set=None, feature_weights=None, callbacks=None)

Supported Arguments:
- X: NumPy array
- y: NumPy array
xgboost.XGBRegressor.predict

xgboost.XGBRegressor.predict(X, output_margin=False, ntree_limit=None, validate_features=True, base_margin=None)

Supported Arguments:
- X: NumPy array
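Note that, unlike the classifier, the regressor's supported input is a NumPy array only, so tabular data should be converted first, for example with DataFrame.to_numpy(). A minimal sketch under that assumption; the column names and random data are illustrative:

>>> import bodo
>>> import pandas as pd
>>> import numpy as np
>>> import xgboost as xgb
>>> @bodo.jit
... def fit_predict(df):
...     X = df[["a", "b"]].to_numpy()  # regressor supports NumPy arrays only
...     y = df["y"].to_numpy()
...     reg = xgb.XGBRegressor()
...     reg.fit(X, y)
...     print(reg.predict(X[:3]))
...
>>> df = pd.DataFrame({"a": np.random.rand(10), "b": np.random.rand(10), "y": np.random.rand(10)})
>>> fit_predict(df)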
Attributes
xgboost.XGBRegressor.feature_importances_
Example Usage
>>> import bodo
>>> import xgboost as xgb
>>> import numpy as np
>>> np.random.seed(42)
>>> @bodo.jit
... def test_xgbr():
...     X = np.random.rand(5, 10)
...     y = np.random.rand(5)
...     reg = xgb.XGBRegressor()
...     reg.fit(X, y)
...     # predict input must have the same number of features (10) the model was trained on
...     print(reg.predict(np.array([[1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]])))
...     print(reg.feature_importances_)
...
>>> test_xgbr()
[0.84368145]
[5.7460850e-01 1.2052832e-04 0.0000000e+00 4.2441860e-01 1.5441242e-04
 6.9795933e-04 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]