python machine-learning predict multiclass-classification lightgbm

LightGBM Multi-classification prediction result

I tried to build a multi-class classification model using LightGBM. After training the model, I parsed some data online and fed it into my model for prediction.

However, the result seems weird to me. I thought each row of the nested array in my prediction holds the probability of each class (I have 4 classes). The result for x-test (the data I used for validation) seems right, but the result for the data I scraped seems weird: it doesn't add up to 1.

In this post, multiclass-classification-with-lightgbm, the prediction result didn't add up to 1 either!

The two dataframes look the same to me, and I am using exactly the same model! Can someone please tell me how to interpret the result, or what I have done wrong?

Solution

For multi-class classification, when the classes are not mutually exclusive, the probabilities may not sum to 1. For example, suppose you are classifying dogs, cats, and birds in images, but the model is shown an image of a car: the probabilities for all three classes should be low, and they will not sum to 1. If you want to force the probabilities to sum to 1, you need to rescale the predictions using this formula.
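The formula linked above is not reproduced here. One common way to rescale, shown as a minimal sketch, is to divide each row by its own sum. The function name `normalize_rows` and the sample scores are hypothetical and only for illustration; they are not taken from the question.

```python
import numpy as np

def normalize_rows(probs):
    """Rescale each row of per-class scores so that it sums to 1."""
    probs = np.asarray(probs, dtype=float)
    row_sums = probs.sum(axis=1, keepdims=True)
    # Guard against all-zero rows to avoid division by zero.
    row_sums[row_sums == 0] = 1.0
    return probs / row_sums

# Hypothetical raw scores for 2 samples and 4 classes (not from the question).
raw = np.array([[0.2, 0.1, 0.05, 0.05],
                [0.9, 0.6, 0.3, 0.2]])
normalized = normalize_rows(raw)
print(normalized.sum(axis=1))  # each row now sums to 1
```

Note that rescaling only changes the presentation: the ranking of classes within each row stays the same, and a sample that scored low on every class will still look confident after normalization.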

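On the other hand, when every sample belongs to exactly one class (for example, the images can only be a cat, a dog, or a bird), the classes are mutually exclusive and the probabilities should sum to 1. In LightGBM this is what the softmax-based `multiclass` objective produces. The one-vs-all `multiclassova` objective scores each class independently, so its outputs need not sum to 1.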
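To see why mutually exclusive classes give rows that sum to 1, and to check which case a given prediction array falls into, here is a small sketch. It applies a softmax to hypothetical raw scores and tests each row's sum. The helper names `softmax` and `rows_sum_to_one` and the sample values are assumptions made for illustration, not part of the original post.

```python
import numpy as np

def softmax(raw_scores):
    """Turn raw per-class scores into probabilities that sum to 1 per row."""
    raw_scores = np.asarray(raw_scores, dtype=float)
    # Subtract the row maximum for numerical stability.
    shifted = raw_scores - raw_scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def rows_sum_to_one(preds, tol=1e-6):
    """Return a boolean array: True where a prediction row sums to 1."""
    preds = np.asarray(preds, dtype=float)
    return np.isclose(preds.sum(axis=1), 1.0, atol=tol)

# Hypothetical raw scores for 2 samples and 4 classes.
raw = [[2.0, 1.0, 0.1, -1.0],
       [0.0, 0.0, 0.0, 0.0]]
probs = softmax(raw)
sums_ok = rows_sum_to_one(probs)
print(sums_ok)
```

Running `rows_sum_to_one` on both the validation predictions and the predictions for the scraped data is a quick way to confirm whether the two outputs really come from the same kind of scoring.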

More references: ref1, ref2, ref3