I found a link about the multinomial Naive Bayes classifier.
How do we calculate B' (or |V|)?
The page says it is the number of terms in the vocabulary. In its example, how do we get 6 for B? Is it the count of all the terms?
"chinese", "beijing", "shanghai", "macao", "tokyo", "japan"
One more question: what if a new term appears in a test document? For example, suppose doc 6 contains "bangkok", or any other word that never appeared before. How do we compute the probability of the new term?
You are right. B is the number of distinct terms in the vocabulary: each term has exactly one entry, however many times it occurs in the training documents, so the six terms you listed give B = 6.
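
To make the counting concrete, here is a minimal Python sketch. The four training documents and the class labels "c" and "j" are assumptions chosen so that the vocabulary matches the six terms in the question; they are not taken from the post. The helper names (`term_counts`, `cond_prob`, `known_tokens`) are hypothetical. The last helper shows one common convention for a test word such as "bangkok" that was never seen in training, which is simply to skip it; other treatments exist.

```python
from collections import Counter

# Assumed toy training set (label, text); illustrative only.
train = [
    ("c", "chinese beijing chinese"),
    ("c", "chinese chinese shanghai"),
    ("c", "chinese macao"),
    ("j", "tokyo japan chinese"),
]

# The vocabulary holds each distinct term once, so B = |V| = len(vocab).
vocab = sorted({tok for _, text in train for tok in text.split()})
B = len(vocab)


def term_counts(label):
    """Count term occurrences over all training documents of one class."""
    counts = Counter()
    for doc_label, text in train:
        if doc_label == label:
            counts.update(text.split())
    return counts


def cond_prob(term, label):
    """Add-one (Laplace) estimate: (count of term in class + 1) / (tokens in class + B)."""
    counts = term_counts(label)
    return (counts[term] + 1) / (sum(counts.values()) + B)


def known_tokens(text):
    """Assumed convention: drop test tokens that are not in the vocabulary."""
    return [tok for tok in text.split() if tok in vocab]


print(B)  # 6
print(cond_prob("chinese", "c"))
print(known_tokens("chinese bangkok tokyo"))
```

With this data, class "c" has 8 tokens in total, 5 of them "chinese", so `cond_prob("chinese", "c")` is (5 + 1) / (8 + 6). Because "bangkok" is not in the vocabulary, `known_tokens` leaves it out and it contributes nothing to the score of either class.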