I've been trying to implement a Naive Bayes classifier in Python for the last few days, using the Iris data set from UCI (http://archive.ics.uci.edu/ml/datasets/Iris). When I classify 100 random samples, I get only 30-40% accuracy. I think my probability function is right, because I tested it against the example from Wikipedia (https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Examples).
Now here's what I do:
Then for 100 random samples I:
Calculate the posterior numerator by multiplying each probability for that class
Store the values in a list and get the index of the highest value
Compare the index of the highest value with the real index (to check whether the prediction is right)
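The prediction steps above can be sketched roughly as follows. This is only an illustration, not the code from the pastebin: it assumes Gaussian likelihoods per feature (as in the Wikipedia example), and the names gaussian_pdf, posterior_numerators, predict_index, priors, means and variances are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    # Normal density, used here as the assumed per-feature likelihood.
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def posterior_numerators(sample, priors, means, variances):
    # One score per class: prior times the product of the feature likelihoods.
    scores = []
    for c in range(len(priors)):
        score = priors[c]
        for j, x in enumerate(sample):
            score *= gaussian_pdf(x, means[c][j], variances[c][j])
        scores.append(score)
    return scores

def predict_index(sample, priors, means, variances):
    # Index of the class with the highest posterior numerator.
    scores = posterior_numerators(sample, priors, means, variances)
    return scores.index(max(scores))
```

The returned index refers to a position in the list of classes, not to a row of the data set, which matters when checking the prediction against the true label.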
Somehow I only get 30-40% accuracy. Am I doing something wrong?
If you want to see the code, it's here: http://pastebin.com/sUYm97qi
LOL -- you wrote very concise/clean code so I was very confused until I saw the very end.
You were comparing classes[max_index], the class name of the prediction, with y[max_index], the label of the max_index-th instance, rather than with y[q], the label of the sample you are actually classifying.
Try changing your code to
    if classes[max_index] == y[q]:
        corr += 1
You should get around 96% accuracy.
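To see why the wrong index can give roughly one third, here is a small sketch on toy data. It assumes the labels in y are stored in the order of the UCI file (all setosa first, then versicolor, then virginica), which may not match your setup. The function accuracy and the names predicted_indices, sample_indices and buggy are made up for the illustration; classes, y, max_index, q and corr follow your code.

```python
def accuracy(predicted_indices, sample_indices, classes, y, buggy=False):
    corr = 0
    for q, max_index in zip(sample_indices, predicted_indices):
        # buggy: look up the label at the predicted class index
        # fixed: look up the label of the sample being classified
        true_label = y[max_index] if buggy else y[q]
        if classes[max_index] == true_label:
            corr += 1
    return corr / len(sample_indices)

classes = ["setosa", "versicolor", "virginica"]
# Toy labels in file order, two instances per class.
y = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
sample_indices = [0, 1, 2, 3, 4, 5]
predicted_indices = [0, 0, 1, 1, 2, 2]  # a classifier that is always right

fixed = accuracy(predicted_indices, sample_indices, classes, y)              # 1.0
broken = accuracy(predicted_indices, sample_indices, classes, y, buggy=True)  # 2/6
```

With the buggy comparison, y[0], y[1] and y[2] are all early rows of the file, so only the setosa predictions count as correct even when every prediction is right. Under that ordering assumption, this would cap the score near one third.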