
Why calling fit resets custom objective function in XGBClassifier?


I have tried to set up the XGBoost sklearn API XGBClassifier to use a custom objective function (Brier) according to the documentation:

    .. note::  Custom objective function

        A custom objective function can be provided for the ``objective``
        parameter. In this case, it should have the signature
        ``objective(y_true, y_pred) -> grad, hess``:

        y_true: array_like of shape [n_samples]
            The target values
        y_pred: array_like of shape [n_samples]
            The predicted values

        grad: array_like of shape [n_samples]
            The value of the gradient for each sample point.
        hess: array_like of shape [n_samples]
            The value of the second derivative for each sample point

Here's my attempt:

import numpy as np
from xgboost import XGBClassifier
from sklearn.datasets import load_svmlight_file
train_data = load_svmlight_file('~/agaricus.txt.train')
X = train_data[0].toarray()
y = train_data[1]

def brier(y_true, y_pred):
    y_pred = 1.0 / (1.0 + np.exp(-y_pred))
    grad = 2 * y_pred * (y_true - y_pred) * (y_pred - 1)
    hess = 2 * y_pred ** (1 - y_pred) * (2 * y_pred * (y_true + 1) - y_true - 3 * y_pred ** 2)
    return grad, hess

m = XGBClassifier(objective=brier, seed=42)

It seemingly results in a correct object:

XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None, gamma=None,
              gpu_id=None, importance_type='gain', interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              objective=<function brier at 0x7fe7ac418290>, random_state=None,
              reg_alpha=None, reg_lambda=None, scale_pos_weight=None, seed=42,
              subsample=None, tree_method=None, validate_parameters=False,
              verbosity=None)

However, calling the .fit method seems to reset the m object to the default setup:

m.fit(X, y)
m
XGBClassifier(base_score=0.5, booster=None, colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
              importance_type='gain', interaction_constraints=None,
              learning_rate=0.300000012, max_delta_step=0, max_depth=6,
              min_child_weight=1, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=0, num_parallel_tree=1,
              objective='binary:logistic', random_state=42, reg_alpha=0,
              reg_lambda=1, scale_pos_weight=1, seed=42, subsample=1,
              tree_method=None, validate_parameters=False, verbosity=None)

with objective='binary:logistic'. I noticed this while investigating why I get a worse Brier score when optimising directly for Brier than when I use the default binary:logistic, as described here.

So, how can I properly set up XGBClassifier to use my function brier as a custom objective?


Solution

  • I believe you are confusing objective with the objective function (the obj parameter); the xgboost documentation is quite confusing sometimes.

    In short, to answer your question, you just need to change it to this:

    m = XGBClassifier(obj=brier, seed=42)
    

    A bit more in depth: objective is what xgboost will optimize, given an objective function. Usually xgboost infers the objective from the number of classes in your y vector.

    I took a snippet from the source code; as you can see, whenever you have only two classes the objective is set to binary:logistic:

    class XGBClassifier(XGBModel, XGBClassifierBase):
        def __init__(self, objective="binary:logistic", **kwargs):
            super().__init__(objective=objective, **kwargs)
    
        def fit(self, X, y, sample_weight=None, base_margin=None,
                eval_set=None, eval_metric=None,
                early_stopping_rounds=None, verbose=True, xgb_model=None,
                sample_weight_eval_set=None, callbacks=None):
    
            evals_result = {}
            self.classes_ = np.unique(y)
            self.n_classes_ = len(self.classes_)
    
            xgb_options = self.get_xgb_params() # <-- obj function is set here
    
            if callable(self.objective):
                obj = _objective_decorator(self.objective) # <----- here is the mismatch of the names; if you pass objective as your brier func, the objective string will become "binary:logistic"
                xgb_options["objective"] = "binary:logistic"
            else:
                obj = None
    
            if self.n_classes_ > 2:
                xgb_options['objective'] = 'multi:softprob' # <----- objective is being set here if n_classes> 2
                xgb_options['num_class'] = self.n_classes_
    
            # ... 35 lines elided: feval = eval_metric if callable(eval_metric) else None ...
    
            self._Booster = train(xgb_options, train_dmatrix, # <----- objective is being passed in xgb_options dictionary
                                  self.get_num_boosting_rounds(),
                                  evals=evals,
                                  early_stopping_rounds=early_stopping_rounds,
                                  evals_result=evals_result, obj=obj, feval=feval, # <----- obj function is being passed to lower level api here
                                  verbose_eval=verbose, xgb_model=xgb_model,
                                  callbacks=callbacks)
    
            # ... 12 lines elided: self.objective = xgb_options["objective"] ...
    
            return self
    

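    For context, the _objective_decorator referenced in the snippet above only adapts your sklearn-style objective(y_true, y_pred) to the (preds, DMatrix) signature that the lower-level API expects; roughly, it looks like this:

    def _objective_decorator(func):
        # adapts a sklearn-style objective(y_true, y_pred) -> grad, hess
        # to the (preds, dmatrix) -> grad, hess form used by the lower-level train
        def inner(preds, dmatrix):
            labels = dmatrix.get_label()
            return func(labels, preds)
        return inner
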
    There is a fixed list of objectives you can set:

    objective [default=reg:squarederror]

    reg:squarederror: regression with squared loss.
    
    reg:squaredlogerror: regression with squared log loss 1/2 * [log(pred + 1) - log(label + 1)]^2. All input labels are required to be greater than -1. Also, see metric rmsle for possible issue with this objective.
    
    reg:logistic: logistic regression
    
    binary:logistic: logistic regression for binary classification, output probability
    
    binary:logitraw: logistic regression for binary classification, output score before logistic transformation
    
    binary:hinge: hinge loss for binary classification. This makes predictions of 0 or 1, rather than producing probabilities.
    
    count:poisson: Poisson regression for count data, output mean of Poisson distribution
    
    max_delta_step is set to 0.7 by default in poisson regression (used to safeguard optimization)
    
    survival:cox: Cox regression for right censored survival time data (negative values are considered right censored). Note that predictions are returned on the hazard ratio scale (i.e., as HR = exp(marginal_prediction) in the proportional hazard function h(t) = h0(t) * HR).
    
    multi:softmax: set XGBoost to do multiclass classification using the softmax objective; you also need to set num_class (number of classes)
    
    multi:softprob: same as softmax, but output a vector of ndata * nclass, which can be further reshaped to ndata * nclass matrix. The result contains predicted probability of each data point belonging to each class.
    
    rank:pairwise: Use LambdaMART to perform pairwise ranking where the pairwise loss is minimized
    
    rank:ndcg: Use LambdaMART to perform list-wise ranking where Normalized Discounted Cumulative Gain (NDCG) is maximized
    
    rank:map: Use LambdaMART to perform list-wise ranking where Mean Average Precision (MAP) is maximized
    
    reg:gamma: gamma regression with log-link. Output is a mean of gamma distribution. It might be useful, e.g., for modeling insurance claims severity, or for any outcome that might be gamma-distributed.
    
    reg:tweedie: Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any outcome that might be Tweedie-distributed.
    

    Just to confirm that objective can't be your brier function: manually setting the objective to your brier function inside the source code, right before calling the lower-level API,

    class XGBClassifier(XGBModel, XGBClassifierBase):
        def __init__(self, objective="binary:logistic", **kwargs):
            super().__init__(objective=objective, **kwargs)
    
        def fit(self, X, y, sample_weight=None, base_margin=None,
                eval_set=None, eval_metric=None,
                early_stopping_rounds=None, verbose=True, xgb_model=None,
                sample_weight_eval_set=None, callbacks=None):
    
            # ... 54 lines elided: evals_result = {} ...
            xgb_options["objective"] = xgb_options["obj"]
            self._Booster = train(xgb_options, train_dmatrix,
                                  self.get_num_boosting_rounds(),
                                  evals=evals,
                                  early_stopping_rounds=early_stopping_rounds,
                                  evals_result=evals_result, obj=obj, feval=feval,
                                  verbose_eval=verbose, xgb_model=xgb_model,
                                  callbacks=callbacks)
    
            # ... 14 lines elided: self.objective = xgb_options["objective"] ...
    

    throws this error:

        raise XGBoostError(py_str(_LIB.XGBGetLastError()))
    xgboost.core.XGBoostError: [10:09:53] /private/var/folders/z5/mchb9bz51cx3h97nkw9v0wkr0000gn/T/pip-install-kh801rm0/xgboost/xgboost/src/objective/objective.cc:26: Unknown objective function: `<function brier at 0x10b630d08>`
    Objective candidate: binary:hinge
    Objective candidate: multi:softmax
    Objective candidate: multi:softprob
    Objective candidate: rank:pairwise
    Objective candidate: rank:ndcg
    Objective candidate: rank:map
    Objective candidate: reg:squarederror
    Objective candidate: reg:squaredlogerror
    Objective candidate: reg:logistic
    Objective candidate: binary:logistic
    Objective candidate: binary:logitraw
    Objective candidate: reg:linear
    Objective candidate: count:poisson
    Objective candidate: survival:cox
    Objective candidate: reg:gamma
    Objective candidate: reg:tweedie
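
    If you want to be sure the Brier gradients are actually driving the boosting, one option is to skip the sklearn wrapper and call the lower-level training API yourself, since train accepts the callable through its obj argument. This is only a minimal sketch, reusing X, y and the brier function from the question; note that with a custom objective the booster's predictions come back untransformed, so the sigmoid has to be applied manually:

    import numpy as np
    import xgboost as xgb

    dtrain = xgb.DMatrix(X, label=y)

    # the lower-level obj callable takes (preds, DMatrix), so wrap brier accordingly
    def brier_obj(preds, dmatrix):
        return brier(dmatrix.get_label(), preds)

    booster = xgb.train({'seed': 42}, dtrain, num_boost_round=100, obj=brier_obj)

    # predictions are raw scores here, not probabilities
    prob = 1.0 / (1.0 + np.exp(-booster.predict(dtrain)))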