
Why calling fit resets custom objective function in XGBClassifier?


I have tried to set up the XGBoost sklearn API XGBClassifier to use a custom objective function (Brier) according to the documentation:

    .. note::  Custom objective function

        A custom objective function can be provided for the ``objective``
        parameter. In this case, it should have the signature
        ``objective(y_true, y_pred) -> grad, hess``:

        y_true: array_like of shape [n_samples]
            The target values
        y_pred: array_like of shape [n_samples]
            The predicted values

        grad: array_like of shape [n_samples]
            The value of the gradient for each sample point.
        hess: array_like of shape [n_samples]
            The value of the second derivative for each sample point

Here's my attempt:

import numpy as np
from xgboost import XGBClassifier
from sklearn.datasets import load_svmlight_file
train_data = load_svmlight_file('~/agaricus.txt.train')
X = train_data[0].toarray()
y = train_data[1]

def brier(y_true, y_pred):
    y_pred = 1.0 / (1.0 + np.exp(-y_pred))
    grad = 2 * y_pred * (y_true - y_pred) * (y_pred - 1)
    hess = 2 * y_pred ** (1 - y_pred) * (2 * y_pred * (y_true + 1) - y_true - 3 * y_pred ** 2)
    return grad, hess

m = XGBClassifier(objective=brier, seed=42)

It seemingly results in a correct object:

XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None, gamma=None,
              gpu_id=None, importance_type='gain', interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              objective=<function brier at 0x7fe7ac418290>, random_state=None,
              reg_alpha=None, reg_lambda=None, scale_pos_weight=None, seed=42,
              subsample=None, tree_method=None, validate_parameters=False,
              verbosity=None)

However, calling the .fit method seems to reset the m object to the default setup:

m.fit(X, y)
m
XGBClassifier(base_score=0.5, booster=None, colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
              importance_type='gain', interaction_constraints=None,
              learning_rate=0.300000012, max_delta_step=0, max_depth=6,
              min_child_weight=1, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=0, num_parallel_tree=1,
              objective='binary:logistic', random_state=42, reg_alpha=0,
              reg_lambda=1, scale_pos_weight=1, seed=42, subsample=1,
              tree_method=None, validate_parameters=False, verbosity=None)

with objective='binary:logistic'. I noticed this while investigating why I get a worse Brier score when optimising directly for Brier than when I use the default binary:logistic, as described here.

So, how can I properly set up XGBClassifier to use my function brier as a custom objective?


Solution

  • I believe you are confusing objective with the objective function (the obj parameter); the xgboost documentation is quite confusing sometimes.

    In short, to answer your question, you just need to change it to this:

    m = XGBClassifier(obj=brier, seed=42)
    

    A bit more in depth: objective is what xgboost will optimize, given an objective function. Usually xgboost infers the objective from the number of classes in your y vector.

    I took a snippet from the source code; as you can see, whenever you have only two classes the objective is set to binary:logistic:

    class XGBClassifier(XGBModel, XGBClassifierBase):
        def __init__(self, objective="binary:logistic", **kwargs):
            super().__init__(objective=objective, **kwargs)
    
        def fit(self, X, y, sample_weight=None, base_margin=None,
                eval_set=None, eval_metric=None,
                early_stopping_rounds=None, verbose=True, xgb_model=None,
                sample_weight_eval_set=None, callbacks=None):
    
            evals_result = {}
            self.classes_ = np.unique(y)
            self.n_classes_ = len(self.classes_)
    
            xgb_options = self.get_xgb_params() # <-- obj function is set here
    
            if callable(self.objective):
                obj = _objective_decorator(self.objective) # <----- here is the mismatch of the names; if you pass objective as your brier func, the objective string will become "binary:logistic"
                xgb_options["objective"] = "binary:logistic"
            else:
                obj = None
    
            if self.n_classes_ > 2:
                xgb_options['objective'] = 'multi:softprob' # <----- objective is being set here if n_classes> 2
                xgb_options['num_class'] = self.n_classes_
    
            # ... 35 lines elided: feval = eval_metric if callable(eval_metric) else None ...
    
            self._Booster = train(xgb_options, train_dmatrix, # <----- objective is being passed in xgb_options dictionary
                                  self.get_num_boosting_rounds(),
                                  evals=evals,
                                  early_stopping_rounds=early_stopping_rounds,
                                  evals_result=evals_result, obj=obj, feval=feval, # <----- obj function is being passed to lower level api here
                                  verbose_eval=verbose, xgb_model=xgb_model,
                                  callbacks=callbacks)
    
            # ... 12 lines elided: self.objective = xgb_options["objective"] ...
    
            return self
    

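    For context, the _objective_decorator referenced in the snippet above only adapts your sklearn-style objective(y_true, y_pred) to the (preds, DMatrix) signature that the lower-level API expects; roughly, it looks like this:

    def _objective_decorator(func):
        # adapts a sklearn-style objective(y_true, y_pred) -> grad, hess
        # to the (preds, dmatrix) -> grad, hess form used by the lower-level train
        def inner(preds, dmatrix):
            labels = dmatrix.get_label()
            return func(labels, preds)
        return inner
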
    There is a fixed list of objectives you can set:

    objective [default=reg:squarederror]

    reg:squarederror: regression with squared loss.
    
    reg:squaredlogerror: regression with squared log loss 1/2 * [log(pred + 1) - log(label + 1)]^2. All input labels are required to be greater than -1. Also, see metric rmsle for possible issue with this objective.
    
    reg:logistic: logistic regression
    
    binary:logistic: logistic regression for binary classification, output probability
    
    binary:logitraw: logistic regression for binary classification, output score before logistic transformation
    
    binary:hinge: hinge loss for binary classification. This makes predictions of 0 or 1, rather than producing probabilities.
    
    count:poisson: Poisson regression for count data, output mean of Poisson distribution
    
    max_delta_step is set to 0.7 by default in poisson regression (used to safeguard optimization)
    
    survival:cox: Cox regression for right censored survival time data (negative values are considered right censored). Note that predictions are returned on the hazard ratio scale (i.e., as HR = exp(marginal_prediction) in the proportional hazard function h(t) = h0(t) * HR).
    
    multi:softmax: set XGBoost to do multiclass classification using the softmax objective; you also need to set num_class (number of classes)
    
    multi:softprob: same as softmax, but output a vector of ndata * nclass, which can be further reshaped to ndata * nclass matrix. The result contains predicted probability of each data point belonging to each class.
    
    rank:pairwise: Use LambdaMART to perform pairwise ranking where the pairwise loss is minimized
    
    rank:ndcg: Use LambdaMART to perform list-wise ranking where Normalized Discounted Cumulative Gain (NDCG) is maximized
    
    rank:map: Use LambdaMART to perform list-wise ranking where Mean Average Precision (MAP) is maximized
    
    reg:gamma: gamma regression with log-link. Output is a mean of gamma distribution. It might be useful, e.g., for modeling insurance claims severity, or for any outcome that might be gamma-distributed.
    
    reg:tweedie: Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any outcome that might be Tweedie-distributed.
    

    Just to confirm that objective can't be your brier function: manually setting the objective to your brier function inside the source code, right before calling the lower-level API,

    class XGBClassifier(XGBModel, XGBClassifierBase):
        def __init__(self, objective="binary:logistic", **kwargs):
            super().__init__(objective=objective, **kwargs)
    
        def fit(self, X, y, sample_weight=None, base_margin=None,
                eval_set=None, eval_metric=None,
                early_stopping_rounds=None, verbose=True, xgb_model=None,
                sample_weight_eval_set=None, callbacks=None):
    
            # ... 54 lines elided: evals_result = {} ...
            xgb_options["objective"] = xgb_options["obj"]
            self._Booster = train(xgb_options, train_dmatrix,
                                  self.get_num_boosting_rounds(),
                                  evals=evals,
                                  early_stopping_rounds=early_stopping_rounds,
                                  evals_result=evals_result, obj=obj, feval=feval,
                                  verbose_eval=verbose, xgb_model=xgb_model,
                                  callbacks=callbacks)
    
            # ... 14 lines elided: self.objective = xgb_options["objective"] ...
    

    throws this error:

        raise XGBoostError(py_str(_LIB.XGBGetLastError()))
    xgboost.core.XGBoostError: [10:09:53] /private/var/folders/z5/mchb9bz51cx3h97nkw9v0wkr0000gn/T/pip-install-kh801rm0/xgboost/xgboost/src/objective/objective.cc:26: Unknown objective function: `<function brier at 0x10b630d08>`
    Objective candidate: binary:hinge
    Objective candidate: multi:softmax
    Objective candidate: multi:softprob
    Objective candidate: rank:pairwise
    Objective candidate: rank:ndcg
    Objective candidate: rank:map
    Objective candidate: reg:squarederror
    Objective candidate: reg:squaredlogerror
    Objective candidate: reg:logistic
    Objective candidate: binary:logistic
    Objective candidate: binary:logitraw
    Objective candidate: reg:linear
    Objective candidate: count:poisson
    Objective candidate: survival:cox
    Objective candidate: reg:gamma
    Objective candidate: reg:tweedie
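
    If you want to be sure the Brier gradients are actually driving the boosting, one option is to skip the sklearn wrapper and call the lower-level training API yourself, since train accepts the callable through its obj argument. This is only a minimal sketch, reusing X, y and the brier function from the question; note that with a custom objective the booster's predictions come back untransformed, so the sigmoid has to be applied manually:

    import numpy as np
    import xgboost as xgb

    dtrain = xgb.DMatrix(X, label=y)

    # the lower-level obj callable takes (preds, DMatrix), so wrap brier accordingly
    def brier_obj(preds, dmatrix):
        return brier(dmatrix.get_label(), preds)

    booster = xgb.train({'seed': 42}, dtrain, num_boost_round=100, obj=brier_obj)

    # predictions are raw scores here, not probabilities
    prob = 1.0 / (1.0 + np.exp(-booster.predict(dtrain)))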