Using xgboost.Booster.predict, I can only get the combined prediction of all the trees or the predicted leaf of each tree. How can I get the prediction value of each individual tree?
xgboost has recently introduced a slicing API, so Raul's answer, while valid, is more complicated than necessary.
To get individual predictions, all you need to do is iterate over the booster object:
individual_preds = []
for tree_ in model.get_booster():
    individual_preds.append(
        tree_.predict(xgb.DMatrix(X))
    )
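Note, however, that these individual predictions are not individual contributions: each one is a probability, so summing them up will not give the final prediction. To recover it, we need to transform them back into log-odds, sum those, and apply the sigmoid again. This works as written only when the model's base score is 0.5 (zero log-odds), because every single-tree prediction also includes the base score: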
import numpy as np
from scipy.special import expit as sigmoid, logit as inverse_sigmoid

# One row per tree, one column per sample
individual_preds = np.vstack(individual_preds)
individual_logits = inverse_sigmoid(individual_preds)
final_logits = individual_logits.sum(axis=0)
final_preds = sigmoid(final_logits)
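To see why the probabilities themselves cannot simply be added, here is a minimal sketch that uses only the standard library. The three per-tree margins are made-up numbers for a single sample, not output from any real model, and a base score of 0.5 (zero log-odds) is assumed.

```python
import math

def sigmoid(z):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Map a probability back to log-odds."""
    return math.log(p / (1.0 - p))

# Hypothetical raw outputs (log-odds) of three trees for one sample.
tree_margins = [0.4, -0.1, 0.25]

# What each single-tree prediction would look like under binary:logistic,
# assuming a base score of 0.5, i.e. no extra offset in log-odds space.
tree_probs = [sigmoid(m) for m in tree_margins]

# Adding the probabilities directly does not even yield a valid probability.
naive_sum = sum(tree_probs)

# Converting back to log-odds, summing, and squashing recovers the ensemble output.
recovered = sigmoid(sum(logit(p) for p in tree_probs))
expected = sigmoid(sum(tree_margins))

assert naive_sum > 1.0
assert math.isclose(recovered, expected)
```

The aggregation is additive only in log-odds space, which is why the snippet above goes through inverse_sigmoid before summing.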
Fully reproducible example, replicating Raul's efforts:
import numpy as np
import xgboost as xgb
from sklearn import datasets
from scipy.special import expit as sigmoid, logit as inverse_sigmoid
# Load data
iris = datasets.load_iris()
X, y = iris.data, (iris.target == 1).astype(int)
# Fit a model
model = xgb.XGBClassifier(
    n_estimators=10,
    max_depth=10,
    use_label_encoder=False,
    objective='binary:logistic',
    # Zero log-odds baseline, so the per-tree logits sum to the total
    base_score=0.5
)
model.fit(X, y)
booster_ = model.get_booster()
# Extract individual predictions
individual_preds = []
for tree_ in booster_:
    individual_preds.append(
        tree_.predict(xgb.DMatrix(X))
    )
individual_preds = np.vstack(individual_preds)
# Aggregate individual predictions into final predictions
individual_logits = inverse_sigmoid(individual_preds)
final_logits = individual_logits.sum(axis=0)
final_preds = sigmoid(final_logits)
# Verify correctness
xgb_preds = booster_.predict(xgb.DMatrix(X))
np.testing.assert_almost_equal(final_preds, xgb_preds)