I'm using Swift (even if my question is not about language) and Python to test my ML logic. I have training data:
("add a new balloon", "add-balloon")
("add a balloon", "add-balloon")
("get last balloon", "get-balloon")
("update balloon color to red", "update-balloon")
When I try to use Naive Bayes to classify new sentences, I get:
classify("could you add a new balloon")
// Returns add-balloon (correct)
classify("could you update the balloon color")
// Returns add-balloon (expected update-balloon)
classify("update the balloon color")
// Returns add-balloon (expected update-balloon)
My data set has a lot of observations for adding a balloon (about 50) but only a few for updating or getting one (about 5-6). Is Naive Bayes sensitive to the number of training observations per class? I don't understand why classification performs poorly even when given a sentence it saw during training.
Naive Bayes is sensitive to class priors (the distribution of examples among classes), so if you have many more add-balloon examples than the other categories, it will be biased toward that class. This is normally helpful: if you know nothing else about a sentence (no posterior information), your best bet is the class that is most likely a priori.
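To make the size of that bias concrete, here is a small sketch (using hypothetical counts roughly matching yours: ~50 vs. ~5 per class) showing the head start the majority class gets in log space before any words are even looked at:

```python
import math

# Hypothetical per-class example counts, roughly matching the question.
counts = {"add-balloon": 50, "update-balloon": 5, "get-balloon": 5}
n = sum(counts.values())

# The prior enters the classification score as log P(class), so the
# majority class starts every prediction with a fixed head start.
log_priors = {c: math.log(k / n) for c, k in counts.items()}

gap = log_priors["add-balloon"] - log_priors["update-balloon"]
print(round(gap, 2))  # log(50/5) ≈ 2.3 nats
```

The word likelihoods of a sentence must overcome that ~2.3-nat gap for update-balloon to win, which short or ambiguous sentences often cannot do.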
However, if your distribution is heavily skewed, your data set is small, and your documents are short or lack very informative words (or contain many ambiguous ones), this can cause undesired results such as what you are reporting.
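You can see both effects with a hand-rolled multinomial Naive Bayes (Laplace smoothing, out-of-vocabulary words ignored). The training set below is hypothetical, built to mimic your 50-vs-5 skew: a sentence containing informative words like "update" and "color" is still classified correctly, but an ambiguous sentence containing only the shared word "balloon" falls to the majority class:

```python
import math
from collections import Counter, defaultdict

# Hypothetical training set mimicking the question's skew:
# 50 "add" examples vs. 5 "update" examples.
train = [("add a new balloon", "add-balloon")] * 50 \
      + [("update balloon color to red", "update-balloon")] * 5

# Fit a multinomial Naive Bayes with Laplace smoothing by hand.
word_counts = defaultdict(Counter)   # class -> word -> count
class_counts = Counter()             # class -> number of documents
vocab = set()
for text, label in train:
    class_counts[label] += 1
    for word in text.split():
        word_counts[label][word] += 1
        vocab.add(word)

def log_posterior(text, label):
    """log P(label) + sum of log P(word | label) over in-vocabulary words."""
    score = math.log(class_counts[label] / sum(class_counts.values()))
    total = sum(word_counts[label].values())
    for word in text.split():
        if word in vocab:  # ignore out-of-vocabulary words
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
    return score

def classify(text):
    return max(class_counts, key=lambda label: log_posterior(text, label))

# Informative words overcome the skewed prior:
print(classify("update the balloon color"))  # -> update-balloon
# An ambiguous sentence (only the shared word "balloon") goes to the majority class:
print(classify("the balloon"))               # -> add-balloon
```

So the fix is usually more (or rebalanced) data for the minority classes, or making sure each class has distinctive vocabulary, rather than anything specific to the classifier implementation.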