
What does this arg max notation mean in the scikit-learn docs for Naive Bayes?


I'm referring to the following page on Naive Bayes:

http://scikit-learn.org/stable/modules/naive_bayes.html

Specifically, the equation beginning with y-hat. I think I generally understand the equations before that, but I don't understand the "arg max y" notation on that line. What does it mean?


Solution

  • Whereas the max of a function is the value of its output at the maximum, the argmax of a function is the value of the input, i.e. the "argument", at which that maximum is attained.

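    A quick way to see the difference in Python, using a toy function that peaks at x = 3 (the names xs and f are just for illustration):

        xs = [0, 1, 2, 3, 4]
        f = lambda x: -(x - 3) ** 2   # toy function, largest at x = 3

        print(max(f(x) for x in xs))  # max    -> 0, the largest output value
        print(max(xs, key=f))         # argmax -> 3, the input that achieves it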

    In the equation in your example:

    $$\hat{y} = \arg\max_y P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

    y_hat (ŷ) is the value of y, i.e. the class label, that maximizes the right-hand expression.

    Here P(y) is typically estimated as the proportion of class y in the training set, also called the "prior", and P(x_i | y) is the probability of observing feature value x_i given that the true class is y, also called the "likelihood".
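    As a minimal sketch of that decision rule in plain Python (assuming priors maps each class label y to P(y) and likelihood[y][x_i] gives P(x_i | y); both names are made up for illustration):

        import math

        def predict(x, priors, likelihood):
            """Return y_hat = argmax_y P(y) * prod_i P(x_i | y)."""
            def score(y):
                return priors[y] * math.prod(likelihood[y][x_i] for x_i in x)
            return max(priors, key=score)  # argmax over the class labels y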

    To understand the product over P(x_i | y) better, consider an example where you are trying to classify a sequence of coin flips as coming from either coin A, which lands heads 50% of the time in the training data, or coin B, which lands heads 66.7% of the time. Here each individual P(x_i | y_j) is the probability that coin y_j (where y_j is either A or B) lands x_i (where x_i is either heads or tails).

    Training set:
    
    THH    A
    HTT    A
    HTH    A
    TTH    A
    HHH    B
    HTH    B
    TTH    B
    
    Test set:
    
    HHT    ?
    

    So the sequence HHT has a 0.667 * 0.667 * 0.333 = 0.148 likelihood given coin B, but only a 0.5 * 0.5 * 0.5 = 0.125 likelihood given coin A. However, we estimate a 4/7 ≈ 57% prior for coin A, since A produced 4 of the 7 training sequences, so we end up predicting coin A: 0.57 * 0.125 ≈ 0.071 beats 0.43 * 0.148 ≈ 0.064. Intuitively, a sequence is a priori more likely to have come from coin A, and that prior outweighs coin A's slightly lower likelihood for this particular sequence.

    If the priors for coins A and B were 50% each, we would instead predict coin B for HHT, since this sequence has the higher likelihood under coin B.
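
    To make the arithmetic concrete, here is a small self-contained Python sketch that estimates the priors and per-flip likelihoods from the training set above and scores HHT under each coin (all names are illustrative):

        from collections import Counter

        train = [("THH", "A"), ("HTT", "A"), ("HTH", "A"), ("TTH", "A"),
                 ("HHH", "B"), ("HTH", "B"), ("TTH", "B")]

        # Prior P(y): fraction of training sequences produced by each coin
        priors = {y: sum(1 for _, label in train if label == y) / len(train)
                  for y in ("A", "B")}

        # Likelihood P(x_i | y): per-flip heads/tails rates, pooled per coin
        likelihood = {}
        for y in ("A", "B"):
            flips = "".join(seq for seq, label in train if label == y)
            counts = Counter(flips)
            likelihood[y] = {f: counts[f] / len(flips) for f in "HT"}

        def score(seq, y):
            """P(y) times the product of per-flip likelihoods."""
            p = priors[y]
            for flip in seq:
                p *= likelihood[y][flip]
            return p

        print({y: round(score("HHT", y), 4) for y in ("A", "B")})
        # {'A': 0.0714, 'B': 0.0635} -> argmax is coin A, as argued above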