
Which is the loss function for multi-class classification in XGBoost?


I'm trying to find out which loss function XGBoost uses for multi-class classification. In this question I found the loss function for logistic classification in the binary case.

I had thought that for the multi-class case it might be the same as in GBM (for K classes), which can be seen here, where y_k=1 if x's label is k and 0 otherwise, and p_k(x) is the softmax function. However, I derived the first- and second-order derivatives of this loss function, and my Hessian differs by a constant factor of 2 from the one defined in the code here (in the function GetGradient in SoftmaxMultiClassObj).

Could you please tell me which loss function is used?

Thank you in advance.


Solution

  • The loss function used for multiclass classification is, as you suspect, the softmax objective (softmax cross-entropy). At present the only multiclass options are the two quoted below: multi:softmax returns only the predicted class for each data point, while multi:softprob returns the predicted probability of every class.

    “multi:softmax” – set XGBoost to do multiclass classification using the softmax objective, you also need to set num_class (number of classes)

    “multi:softprob” – same as softmax, but output a vector of ndata * nclass, which can be further reshaped to ndata, nclass matrix. The result contains predicted probability of each data point belonging to each class.

    See https://xgboost.readthedocs.io/en/latest//parameter.html#learning-task-parameters.
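
    To make the derivatives discussed in the question concrete, here is a minimal sketch, assuming the loss is the softmax cross-entropy L = -sum_k y_k * log(p_k(x)) taken with respect to the raw per-class scores. The function names are hypothetical and this is not XGBoost's own code; it only checks the textbook gradient p_k - y_k and diagonal Hessian p_k * (1 - p_k) against finite differences.

```python
import math

def softmax(scores):
    # Subtract the maximum score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, label):
    # Loss for one data point: -log of the probability of the true class.
    return -math.log(softmax(scores)[label])

def analytic_grad_hess(scores, label):
    # Textbook derivatives w.r.t. the raw scores (hypothetical helper,
    # not taken from the XGBoost source).
    p = softmax(scores)
    grad = [p_k - (1.0 if k == label else 0.0) for k, p_k in enumerate(p)]
    hess = [p_k * (1.0 - p_k) for p_k in p]  # diagonal of the Hessian only
    return grad, hess

def numeric_grad_hess(scores, label, eps=1e-4):
    # Central finite differences, one score at a time.
    base = cross_entropy(scores, label)
    grad, hess = [], []
    for k in range(len(scores)):
        up = list(scores)
        up[k] += eps
        down = list(scores)
        down[k] -= eps
        l_up = cross_entropy(up, label)
        l_down = cross_entropy(down, label)
        grad.append((l_up - l_down) / (2 * eps))
        hess.append((l_up - 2 * base + l_down) / eps ** 2)
    return grad, hess

scores = [0.5, -1.2, 2.0]  # arbitrary example scores for K = 3 classes
label = 2                  # index of the true class
g_a, h_a = analytic_grad_hess(scores, label)
g_n, h_n = numeric_grad_hess(scores, label)
print("gradient:", g_a)
print("diagonal Hessian:", h_a)
```

    Under these assumptions, the constant factor of 2 mentioned in the question would correspond to multiplying hess by 2. That rescales the second-order term used in each boosting step; it does not change the loss function being minimized.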