
How does CatBoost perform multiclass classification?


I am trying to figure out how CatBoost performs multiclass classification with the MultiClass loss function. As I understand it, for each prediction MultiClass requires M values, one for each of the M classes. My questions are:

  • How are those M values obtained?

  • How are those M values converted into predicted probabilities?

My current hypothesis is that CatBoost builds a separate binary classifier for each of the M classes and then uses the softmax function to get the predicted probabilities.

  • If this is the case, are the tree sequences of the individual classifiers the same, or completely different?

Solution

  • Some other common GBMs work the way your hypothesis describes: they build one-vs-rest classifiers (with completely different trees in general) and apply softmax at the end to recover the final predictions.

    CatBoost, however, apparently builds a single set of multi-output trees:
    https://github.com/catboost/catboost/issues/1806
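
    To make the "multi-output trees plus softmax" idea concrete, here is a minimal pure-Python sketch. It is not CatBoost's actual implementation or model format: the toy stumps in `TOY_TREES`, their thresholds and leaf values, and the helper names `raw_scores` and `softmax` are all hypothetical, chosen only for illustration. It shows one shared sequence of trees in which every leaf stores M values (M = 3 here); the leaf vectors are summed across trees to give M raw scores, and softmax turns those into probabilities.

```python
import math

def softmax(raw):
    """Convert M raw scores into M probabilities that sum to 1."""
    m = max(raw)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in raw]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy ensemble (NOT CatBoost's internal format):
# each stump splits on one feature, and each leaf holds a vector
# of M = 3 values, one per class.
TOY_TREES = [
    {"feature": 0, "threshold": 0.5,
     "left": [0.2, -0.1, -0.1], "right": [-0.3, 0.1, 0.2]},
    {"feature": 1, "threshold": 1.0,
     "left": [0.1, 0.0, -0.1], "right": [-0.2, 0.3, -0.1]},
]

def raw_scores(x, trees):
    """Sum the M-dimensional leaf vectors of all trees for sample x."""
    scores = [0.0] * len(trees[0]["left"])
    for tree in trees:
        goes_left = x[tree["feature"]] <= tree["threshold"]
        leaf = tree["left"] if goes_left else tree["right"]
        scores = [s + v for s, v in zip(scores, leaf)]
    return scores

x = [0.7, 0.4]                 # one sample with two features
raw = raw_scores(x, TOY_TREES) # M raw values from one shared tree sequence
probs = softmax(raw)           # M probabilities
print(raw, probs)
```

    On a trained CatBoost model you should be able to check the same relationship yourself: `model.predict(X, prediction_type='RawFormulaVal')` returns the M raw values per sample, and applying softmax to them should reproduce `model.predict_proba(X)` for the MultiClass loss.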