I'm studying XGBoost and am a novice at gradient boosting. In gradient tree boosting, the loss function is derived via a second-order approximation, which requires computing g_i and h_i for each instance. You can see it at https://xgboost.readthedocs.io/en/latest/model.html#the-structure-score . Given a dataset, how can I see the values g_i, h_i, i.e., g1, h1, g2, h2, ...?
I looked at _train_internal and several functions in training.py and sklearn.py, but I didn't find them. By understanding how they are calculated and obtained efficiently, it might be possible to apply further algorithms used in xgboost, such as the weighted quantile sketch.
Thanks.
To keep track of the gradient updates in each iteration, you need to expose the training loop in Python (rather than have it execute internally in the C++ implementation) and provide custom gradient and Hessian implementations. For many standard loss functions (e.g., squared loss, logistic loss) this is quite simple and easy to find in standard references. Here's an example showing how to expose the training loop for logistic regression.
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def logregobj(preds, dtrain):
    """Gradient and Hessian of the logistic loss w.r.t. the raw margin preds."""
    labels = dtrain.get_label()
    preds = sigmoid(preds)
    grad = preds - labels          # g_i
    hess = preds * (1.0 - preds)   # h_i
    return grad, hess
# Build a toy dataset.
X, Y = make_classification(n_samples=1000, n_features=5, n_redundant=0,
                           n_informative=3, random_state=1,
                           n_clusters_per_class=1)

# Instantiate a Booster object to do the heavy lifting
dtrain = xgb.DMatrix(X, label=Y)
params = {'max_depth': 2, 'eta': 1}
model = xgb.Booster(params, [dtrain])
# Run 10 boosting iterations
# g and h can be monitored for gradient statistics
for i in range(10):
    pred = model.predict(dtrain)
    g, h = logregobj(pred, dtrain)
    print(i, g[:3], h[:3])  # peek at g_1, h_1, g_2, h_2, g_3, h_3
    model.boost(dtrain, g, h)
# Evaluate predictions
yhat = sigmoid(model.predict(dtrain))
yhat_labels = np.round(yhat)
print(confusion_matrix(Y, yhat_labels))
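If you want to convince yourself that the g_i, h_i formulas above are correct, you can check them against finite differences of the loss with respect to the raw margin. This is a standalone sketch (it does not use xgboost at all): the analytic gradient p_i - y_i and Hessian p_i(1 - p_i) of the logistic loss should match central differences of the loss. For squared loss the analogue is even simpler: g_i = pred_i - y_i and h_i = 1.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logloss(z, y):
    """Logistic loss of raw margin z against label y."""
    p = sigmoid(z)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# A few illustrative margin scores and 0/1 labels
z = np.array([-1.2, 0.0, 0.7, 2.3])
y = np.array([0.0, 1.0, 1.0, 0.0])
eps = 1e-4

# Analytic g_i and h_i (same formulas as logregobj above)
g = sigmoid(z) - y
h = sigmoid(z) * (1 - sigmoid(z))

# Central finite differences of the loss w.r.t. the margin z
g_num = (logloss(z + eps, y) - logloss(z - eps, y)) / (2 * eps)
h_num = (logloss(z + eps, y) - 2 * logloss(z, y) + logloss(z - eps, y)) / eps**2
```

Here `g_num` and `h_num` agree with `g` and `h` to within the finite-difference error, which confirms the derivatives used in the custom objective.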