python scikit-learn gaussian naivebayes

sklearn Gaussian Naive Bayes - why "Gaussian"?


I understand Bayes' theorem, but I don't understand what the "Gaussian" part of the classifier is. Why is it called "Gaussian"?


Solution

  • Consider the setting of sklearn.naive_bayes.GaussianNB. The fit method takes an X and a y and fits a model to them. They correspond to samples of a random feature vector X and a class label y, and y takes some values c ∈ C. So, from the training data we can estimate f(X|C = c), the density of the features within each class. Of course, what we are interested in is P(C = c|X). If you recall Bayes' Theorem,

    P(A | B) = P(B | A)P(A) / P(B),

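    we need a model for the class-conditional density f(X|C = c), together with the prior P(C = c), for this reversal. In Gaussian naive Bayes, each feature is assumed to follow a normal distribution within each class, independently of the other features given the class (that independence is the "naive" part). The denominator P(X) is the same for every class, so it only acts as a normalising constant.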
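
To make the "Gaussian" part concrete, here is a minimal from-scratch sketch of the idea in plain Python. It is not sklearn's implementation: the function names (fit_gaussian_nb, log_gaussian, predict_proba), the toy data, and the small 1e-9 variance floor are all assumptions made for illustration. It estimates one mean and one variance per feature and per class, then applies Bayes' theorem to get P(C = c|X).

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate, per class, the prior and each feature's mean and variance."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Maximum-likelihood variance; the small constant (an assumption of
        # this sketch) avoids division by zero for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def log_gaussian(x, mean, var):
    """Log of the normal density N(x; mean, var)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def predict_proba(model, x):
    """Return P(C = c | X = x) for every class c via Bayes' theorem."""
    # log P(C = c) + sum_i log f(x_i | C = c): the "naive" independence step.
    log_joint = {
        label: math.log(prior)
        + sum(log_gaussian(v, m, s) for v, m, s in zip(x, means, variances))
        for label, (prior, means, variances) in model.items()
    }
    # Dividing by P(X = x) amounts to normalising over the classes.
    top = max(log_joint.values())
    unnorm = {label: math.exp(lj - top) for label, lj in log_joint.items()}
    total = sum(unnorm.values())
    return {label: u / total for label, u in unnorm.items()}


# Toy data (made up for illustration): one feature, two classes.
X = [[1.0], [1.2], [0.8], [3.0], [3.2], [2.8]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(X, y)
probs = predict_proba(model, [1.1])
```

Here probs maps each class label to its posterior probability for the point 1.1, which lies much closer to the mean of class "a" than to the mean of class "b". The only distributional assumption in the whole procedure is the normal density used in log_gaussian, which is where the name comes from.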