Tags: python, machine-learning, scikit-learn, probability, multiclass-classification

Interpreting predict_proba, multinomial Naive Bayes


I used MultinomialNB() from scikit-learn. How should I interpret the probabilities returned by predict_proba? My initial guess was: a probability of 0.8 means that the classifier is 80% certain that class X is the right class.

I found a related question, but no answers were provided.


Solution

  • Your intuition is correct. As the documentation states, predict_proba returns, for every sample, the probability of each class in the model. So if your trained model has 4 classes and predict_proba returns [0.6, 0.2, 0.19, 0.01] for a sample (each row always sums to 1), the model assigns that sample a 60% probability of belonging to the first class, 20% to the second, and so on. The columns follow the order of the classes in the fitted model's classes_ attribute.

    Documentation: https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.MultinomialNB.html
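
    The interpretation above can be checked with a small sketch. The feature counts and the labels ("sports", "politics", "tech") are invented for illustration and are not from the question; only MultinomialNB, predict_proba, predict and classes_ are actual scikit-learn names.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy count features (for example, word counts per document).
# The numbers and the labels are made up for illustration.
X = np.array([
    [3, 0, 1],
    [2, 0, 0],
    [0, 4, 1],
    [0, 3, 0],
    [1, 1, 5],
    [0, 0, 4],
])
y = np.array(["sports", "sports", "politics", "politics", "tech", "tech"])

clf = MultinomialNB()
clf.fit(X, y)

# Two unseen samples to score.
X_new = np.array([[2, 0, 1], [0, 1, 3]])
proba = clf.predict_proba(X_new)  # shape: (n_samples, n_classes)

# Column i of proba corresponds to clf.classes_[i].
for row in proba:
    print(dict(zip(clf.classes_, row.round(3))))

# Each row is a probability distribution over the classes.
print(proba.sum(axis=1))

# predict returns the class whose probability is highest.
print(clf.classes_[proba.argmax(axis=1)], clf.predict(X_new))
```

    Pairing each row with clf.classes_ avoids guessing which column belongs to which class, since the column order comes from the fitted model rather than from the order in which labels first appear in y.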