I'm working on homework for my machine learning course and am having trouble understanding the question on Naive Bayes. The problem I have is a variation of question number 2 on the following page:
https://www.cs.utexas.edu/~mooney/cs343/hw3-old/hw3.html
The numbers in my assignment are slightly different, so I'll use the numbers from the example above instead. I'm currently trying to figure out the probability that the first text is physics. To do so, I have something that looks a little like this:
P(physics|c) = P(physics) * P(carbon|physics) * P(atom|physics) * P(life|physics) * P(earth|physics) / [SOMETHING]
P(physics|c) = .35 * .005 * .1 * .001 * .005 / [SOMETHING]
I'm basing this off of an example that I've seen in my notes, but I can't seem to figure out what I'm supposed to divide by. I'll provide the example from the notes as well.
Perhaps I'm going about this in the wrong way, but I'm unsure where the P(X) term that we're dividing by is coming from. How does this relate to the probability that the text is physics? I feel that getting this issue resolved will make the remainder of the assignment simple.
The denominator P(X) is just the sum of P(X|Y)*P(Y) over all possible classes Y.
Now, it's important to note that in Naive Bayes, you do not have to compute this P(X) if all you need is the predicted class. You only have to compute P(X|Y)*P(Y) for each class, and then select the class that produced the highest value.
In your case, I assume you must have several classes. You mentioned physics, but there must be others like chemistry or math.
So you can compute:
P(physics|X) = P(X|physics) * P(physics) / P(X)
P(chemistry|X) = P(X|chemistry) * P(chemistry) / P(X)
P(math|X) = P(X|math) * P(math) / P(X)
P(X) is the sum of P(X|Y)*P(Y) over all classes:
P(X) = P(X|physics)*P(physics) + P(X|chemistry)*P(chemistry) + P(X|math)*P(math)
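To make the arithmetic concrete, here is a minimal Python sketch of the normalization step. The physics prior and word likelihoods are the ones quoted in the question; the chemistry and math numbers are invented placeholders (not from the assignment), and the names `priors`, `likelihoods`, `joint`, and `posterior` are just illustrative.

```python
from math import prod

# Class priors P(Y). Only the physics value comes from the question;
# the chemistry and math values are hypothetical placeholders.
priors = {"physics": 0.35, "chemistry": 0.40, "math": 0.25}

# Per-word likelihoods P(word|Y) for carbon, atom, life, earth.
# Again, only the physics row comes from the question.
likelihoods = {
    "physics": [0.005, 0.1, 0.001, 0.005],
    "chemistry": [0.01, 0.2, 0.008, 0.003],    # hypothetical
    "math": [0.001, 0.001, 0.001, 0.001],      # hypothetical
}

# Numerator of Bayes' rule for each class: P(X|Y) * P(Y),
# where P(X|Y) is the product of the word likelihoods (the "naive" assumption).
joint = {c: priors[c] * prod(likelihoods[c]) for c in priors}

# Denominator: P(X) is the sum of the numerators over all classes.
p_x = sum(joint.values())

# Posterior P(Y|X) for each class; these sum to 1 by construction.
posterior = {c: joint[c] / p_x for c in joint}

print(posterior)
```

So the [SOMETHING] in the question is `p_x`: the physics numerator plus the same kind of product computed for every other class.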
(By the way, the above statement is exactly analogous to the example in the image that you provided. The equations are a bit complicated there, but if you rearrange them, you will find that P(X) = P(X|positive)*P(positive) + P(X|negative)*P(negative) in that example.)
To produce the answer (that is, to determine Y among physics, chemistry, or math), you would select the maximum value among P(physics|X), P(chemistry|X), and P(math|X).
As I mentioned, you do not need to compute P(X) because the same term appears in the denominator of each of P(physics|X), P(chemistry|X), and P(math|X), so dividing by it does not change which one is largest. Thus, you only need to determine the max among P(X|physics)*P(physics), P(X|chemistry)*P(chemistry), and P(X|math)*P(math).
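As a rough illustration of why the division can be skipped, the sketch below picks the winning class from the unnormalized scores and checks that normalizing gives the same winner. The physics score is the product of the numbers in the question; the chemistry and math scores are made-up placeholders, and `predict` is a hypothetical helper name.

```python
def predict(scores):
    """Return the class with the largest score in a {class: score} dict."""
    return max(scores, key=scores.get)

# Unnormalized scores P(X|Y) * P(Y).
# physics = .35 * .005 * .1 * .001 * .005 from the question;
# the other two values are hypothetical placeholders.
scores = {"physics": 8.75e-10, "chemistry": 1.92e-8, "math": 2.5e-13}

# Normalized posteriors P(Y|X), obtained by dividing by P(X).
p_x = sum(scores.values())
posteriors = {c: s / p_x for c, s in scores.items()}

# Dividing every score by the same positive constant keeps the ordering,
# so both dictionaries yield the same predicted class.
assert predict(scores) == predict(posteriors)
```

You still need P(X) if the assignment asks for the actual probability that the text is physics rather than just the most likely class.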