I'm looking for a Multinomial Naive Bayes classifier written in C/C++ for use with OpenCV.
I'm looking for the algorithm (or a ready-made implementation), since I'm trying to understand how it works.
The Naive Bayes classifier is a well-known classification algorithm, especially in the field of text classification, so I will use that setting to explain it.
Assume we have some training documents {d1, d2, d3, ..., dm},
where each document is represented by a collection of words {w1, w2, w3, ..., wn},
and each document belongs to one of a predefined set of classes (take the binary case (c_0, c_1) here).
Our task is to classify a new input document d into either class c_0 or class c_1.
An intuitive approach is to pick the class with the maximum likelihood: that is,
output c_0 if P(d | c_0) > P(d | c_1), and c_1 otherwise.
By our definition of d, we can write the criterion as
P(d | c_0) = P({w1,w2,w3,...,wn} | c_0)
Calculating this joint probability given the class is complicated, so we make a strong assumption: the words are mutually independent conditioned on the class. That leads us to
P(d | c_0) = P({w1,w2,w3,...,wn} | c_0) = P(w1|c_0)*P(w2|c_0)*P(w3|c_0)*...*P(wn|c_0)
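where each P(w | c) can be easily computed as the relative frequency of word w in class c, i.e. the number of times w occurs in the training documents of class c divided by the total number of words in those documents.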
This strong assumption is the reason for the name "Naive", since we naively multiply the per-word probabilities together.
Finally, taking answer = argmax over c in {c_0, c_1} of P(d | c) completes the algorithm.
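To make the whole procedure concrete, here is a small self-contained C++ sketch of the binary case. Treat it as an illustration under assumptions rather than a reference implementation: the type NaiveBayesModel and its member names are made up for this example, and it deviates from the plain formula in two ways. It sums log-probabilities instead of multiplying probabilities (a long product of small numbers underflows to zero in floating point), and it uses add-one (Laplace) smoothing so that a word unseen in a class does not force the whole product to zero.

```cpp
#include <cmath>
#include <map>
#include <set>
#include <string>
#include <vector>

using Document = std::vector<std::string>;

// Hypothetical model for two classes, labelled 0 (c_0) and 1 (c_1).
struct NaiveBayesModel {
    std::map<std::string, long> counts[2];  // word counts per class
    long totalWords[2] = {0, 0};            // total words seen per class
    std::set<std::string> vocabulary;       // all distinct training words

    // Add one labelled training document (label must be 0 or 1).
    void train(const Document& doc, int label) {
        for (const auto& word : doc) {
            ++counts[label][word];
            ++totalWords[label];
            vocabulary.insert(word);
        }
    }

    // log P(d | c) = sum over words of log P(w | c).
    // Assumption: add-one smoothing, P(w | c) = (count + 1) / (total + |V|).
    // Assumes train() has been called at least once.
    double logLikelihood(const Document& doc, int label) const {
        const double denom = static_cast<double>(totalWords[label]) +
                             static_cast<double>(vocabulary.size());
        double sum = 0.0;
        for (const auto& word : doc) {
            auto it = counts[label].find(word);
            const long c = (it == counts[label].end()) ? 0 : it->second;
            sum += std::log((static_cast<double>(c) + 1.0) / denom);
        }
        return sum;
    }

    // argmax over the two classes; ties go to class 0.
    int classify(const Document& doc) const {
        return logLikelihood(doc, 0) >= logLikelihood(doc, 1) ? 0 : 1;
    }
};
```

Usage is just a matter of calling train(doc, label) for every training document and then classify(newDoc). For features other than words, the "words" would be whatever discrete tokens you extract from your data.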
I guess what you're looking for in your domain is similar to text classification, except that the features you need to extract are different.