I need to know which loss functions are used in the h2o gbm and xgboost functions for the gaussian, binomial and multinomial distributions. Unfortunately, my knowledge of Java is very limited and I can't really decipher the source code, and there doesn't seem to be any document clarifying which distribution is associated with which function. I think I gather from here that it's logloss for binomial and MSE for gaussian, but I can't find anything for multinomial. Does anybody here maybe know the answer?
Thank you for your question. We definitely should provide this information in the documentation. We are working on improving the doc. To answer your question:
You are right that it is logloss for binomial and MSE (squared error) for Gaussian. For multinomial classification, the loss function is softmax for both H2O GBM and XGBoost. H2O GBM is implemented based on the paper Greedy Function Approximation: A Gradient Boosting Machine (Jerome H. Friedman, 2001). In Section 4.6 the author explains nicely how it is calculated and why.
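For concreteness, here is a minimal, hypothetical Java sketch (not H2O's or XGBoost's actual implementation) of the softmax loss (multinomial deviance) for a single observation, given raw link-space scores and the true class label:

```java
import java.util.Arrays;

public class SoftmaxLoss {
    // Convert link-space scores f_k into class probabilities via softmax.
    static double[] softmax(double[] scores) {
        double max = Arrays.stream(scores).max().getAsDouble(); // numeric stability
        double[] p = new double[scores.length];
        double sum = 0.0;
        for (int k = 0; k < scores.length; k++) {
            p[k] = Math.exp(scores[k] - max);
            sum += p[k];
        }
        for (int k = 0; k < scores.length; k++) p[k] /= sum;
        return p;
    }

    // Softmax cross-entropy loss for one observation with true class y:
    // the negative log of the probability assigned to the true class.
    static double loss(double[] scores, int y) {
        return -Math.log(softmax(scores)[y]);
    }

    public static void main(String[] args) {
        // With uniform scores, each of 3 classes gets probability 1/3,
        // so the loss is log(3).
        System.out.println(loss(new double[]{0.0, 0.0, 0.0}, 0));
    }
}
```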
Based on the loss function, the negHalfGradient method is defined, and every distribution implements it individually. For the multinomial distribution (here), the implementation is:
@Override
public double negHalfGradient(double y, double f, int l) {
return ((int) y == l ? 1f : 0f) - f;
}
Where:

- y is the actual response
- f is a predicted response in link space
- l is a class label (the original labels are converted lexicographically to 0 ... number of classes - 1)

Let me know if you have other questions.
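To illustrate, here is a hypothetical standalone sketch mirroring the quoted method: the negative half-gradient is the indicator of the true class minus the prediction for class l, which is the familiar softmax gradient when the prediction is the class probability (the class names and the per-class loop below are my own, not H2O code):

```java
public class MultinomialGradient {
    // Mirrors the quoted negHalfGradient: I(y == l) - f,
    // i.e. indicator of the true class minus the prediction for class l.
    static double negHalfGradient(double y, double f, int l) {
        return ((int) y == l ? 1.0 : 0.0) - f;
    }

    public static void main(String[] args) {
        double[] p = {0.7, 0.2, 0.1}; // example per-class predictions for one row
        int y = 0;                    // actual class label
        // The true class gets a positive pseudo-residual, the others negative.
        for (int l = 0; l < p.length; l++) {
            System.out.printf("class %d: %.2f%n", l, negHalfGradient(y, p[l], l));
        }
    }
}
```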