I am using python to fit an xgboost model incrementally (chunk by chunk). I came across a solution that uses xgboost.train but I do not know what to do with the Booster object that it returns. For instance, the XGBClassifier has options like fit, predict, predict_proba etc.
Here is what happens inside the for loop that I am reading in the data little by little:
dtrain=xgb.DMatrix(X_train, label=y)
param = {'max_depth':2, 'eta':1, 'silent':1, 'objective':'binary:logistic'}
modelXG=xgb.train(param,dtrain,xgb_model='xgbmodel')
modelXG.save_model("xgbmodel")
XGBClassifier
is a scikit-learn
compatible class which can be used in conjunction with other scikit-learn utilities.
Other than that, its just a wrapper over the xgb.train
, in which you dont need to supply advanced objects like Booster
etc.
Just send your data to fit()
, predict()
etc and internally it will be converted to appropriate objects automatically.