I'm working on classifying invoices and receipts, and I will be using the Bernoulli Naive Bayes model.
This is the Naive Bayes classifier:
P(c|x) = P(x|c) * P(c) / P(x)
I know how to compute the class prior probability P(c), and since we assume that all the words are independent, we don't need P(x).
Now the formula looks like this: P(c|x) = P(x|c) * P(c). To compute the likelihood P(x|c) we multiply the probabilities of all the words: P(x|c) = P(x1|c)*P(x2|c)*P(x3|c)*...
My question is: after calculating the likelihood, do I need to multiply it by P(c) or not, i.e. P(c|x) = P(x1|c)*P(x2|c)*P(x3|c)*...*P(c)?
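To make the question concrete, here is a rough sketch of what I compute so far (toy code, not my real pipeline; `likelihood`, `word_probs` and `vocab` are made-up names for illustration):

```python
def likelihood(doc_words, word_probs, vocab):
    """Bernoulli likelihood P(x|c): product over the whole vocabulary of
    P(word present | c) for words in the document and P(word absent | c)
    for words that are missing."""
    p = 1.0
    for word in vocab:
        if word in doc_words:
            p *= word_probs[word]        # P(x_i = 1 | c)
        else:
            p *= 1.0 - word_probs[word]  # P(x_i = 0 | c)
    return p

# Is the class score just likelihood(...), or likelihood(...) * prior_c ?
```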
P(c|x) is not equal to P(x|c) P(c); it is proportional to it. During classification you compute

cl(x) = arg max_c P(c|x) = arg max_c P(x|c) P(c) / P(x) = arg max_c P(x|c) P(c)

and this holds for every probability distribution with P(x) > 0; no Naive Bayes assumption is needed at this point. It is just Bayes' theorem plus noticing that P(x) is a positive constant in this maximization, since it does not depend on c. Thus you never actually compute P(c|x); you just compute P(x|c) P(c), which gives the same classification. So yes, your classification has to be based on the product of P(x|c) and P(c), where, as you pointed out, P(x|c) = PROD_i P(x_i|c) (this is where the Naive Bayes independence assumption is used, not before).
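To make that decision rule concrete, here is a minimal sketch for a Bernoulli model (all names and numbers are made up for illustration; logs are used only to avoid numerical underflow, the arg max is unchanged, and P(x) never appears):

```python
import math

def classify(doc_words, priors, word_probs, vocab):
    """Return arg max_c [ log P(c) + sum_i log P(x_i | c) ].
    P(x) is never computed: it is the same positive constant for every c."""
    best_class, best_score = None, -math.inf
    for c, prior in priors.items():
        score = math.log(prior)               # log P(c)
        for word in vocab:                    # Bernoulli NB: every vocab word contributes
            p = word_probs[c][word]           # P(word present | c)
            score += math.log(p if word in doc_words else 1.0 - p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Toy usage:
vocab = {"invoice", "total", "thank"}
priors = {"invoice": 0.6, "receipt": 0.4}
word_probs = {
    "invoice": {"invoice": 0.9, "total": 0.7, "thank": 0.2},
    "receipt": {"invoice": 0.2, "total": 0.6, "thank": 0.8},
}
print(classify({"invoice", "total"}, priors, word_probs, vocab))  # -> "invoice"
```

So the prior P(c) is multiplied in (here, added in log space) exactly once per class, on top of the product of the per-word likelihoods.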