python machine-learning scikit-learn naivebayes

What is the expected result for this multinomial naive Bayes model code?

What should the expected result be?

When I calculated it by hand, I got P(y=1|x=1) > P(y=0|x=1), but the model predicts 0.

from sklearn.naive_bayes import GaussianNB, MultinomialNB

xx = [[1], [1], [1], [2], [2], [3]]
yy = [1, 1, 1, 0, 0, 0]
# clf = GaussianNB()
clf = MultinomialNB()
clf.fit(xx, yy)
print(clf.predict([[1]]))
# [0]

I also tried changing the alpha parameter from 1 to 1000. The output is still 0 for input 1.

Solution

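Multinomial naive Bayes assumes each feature column holds a count (a word count, for example) and models how each class distributes its total count across the columns. With a single column, every class puts all of its probability mass on that one column, so the likelihood is identical for both classes. The prediction then falls back to the class priors, which are equal here, and the tie resolves to the first class, 0. That is also why changing alpha has no effect. Encoding each distinct value as its own indicator column gives the model something to compare, as the following code shows: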

import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MultiLabelBinarizer

xx = [[1], [1], [1], [2], [2], [3]]
yy = [1, 1, 1, 0, 0, 0]

# One indicator column per distinct value (1, 2, 3)
mlb = MultiLabelBinarizer()
xxtransformed = mlb.fit_transform(xx)
print(xxtransformed)
# [[1 0 0]
#  [1 0 0]
#  [1 0 0]
#  [0 1 0]
#  [0 1 0]
#  [0 0 1]]

clf = MultinomialNB()
clf.fit(xxtransformed, yy)

# The query point must go through the same transformation
print(mlb.transform(np.array([[1]])))
# [[1 0 0]]
print(clf.predict(mlb.transform(np.array([[1]]))))
# [1]

And indeed, we get the expected prediction of 1.
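
To see why the original single-column input always predicts 0, it may help to inspect the fitted parameters directly and to compare the one-hot model against a hand calculation. The sketch below is only an illustration: the one-hot matrix is typed in by hand (assumed to match the MultiLabelBinarizer output above), the default alpha=1 is assumed, and names such as clf_single, clf_onehot and manual_posterior are made up for this example.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

yy = np.array([1, 1, 1, 0, 0, 0])

# Case 1: the original single-column input.
xx = np.array([[1], [1], [1], [2], [2], [3]])
clf_single = MultinomialNB().fit(xx, yy)
single_log_prob = clf_single.feature_log_prob_   # log P(column | class)
single_proba = clf_single.predict_proba([[1]])
print(single_log_prob)  # zero for both classes: one column takes all the mass
print(single_proba)     # equal posteriors, so predict() returns classes_[0]

# Case 2: one indicator column per distinct value (hand-written one-hot matrix).
x_onehot = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
])
clf_onehot = MultinomialNB().fit(x_onehot, yy)
onehot_proba = clf_onehot.predict_proba([[1, 0, 0]])

# Hand calculation with Laplace smoothing (alpha = 1, three columns).
alpha = 1.0
classes = clf_onehot.classes_                     # [0, 1]
counts = np.array([x_onehot[yy == c].sum(axis=0) for c in classes])
theta = (counts + alpha) / (counts.sum(axis=1, keepdims=True) + alpha * counts.shape[1])
prior = np.array([np.mean(yy == c) for c in classes])

# For the query [1, 0, 0], only the first column contributes to the likelihood.
unnormalized = prior * theta[:, 0]
manual_posterior = unnormalized / unnormalized.sum()

print(onehot_proba)      # posterior from scikit-learn, ordered as classes_
print(manual_posterior)  # posterior from the hand calculation
```

In the single-column case both log-likelihoods are zero, so the posterior equals the prior (0.5 each) whatever alpha is. In the one-hot case the hand calculation gives P(y=1|x=1) = 0.8, which agrees with the manual result in the question.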