I am trying (just for fun) to classify movies based on their descriptions. The idea is to "tag" movies, so a given movie might be "action" and "humor" at the same time, for example.
Normally a text classifier returns the single class a given text belongs to, but in my case I want to assign each text anywhere from 1 to N tags.
Currently my training set looks like this:
+--------------------------+---------+
| TEXT                     | TAG     |
+--------------------------+---------+
| Some text from a movie   | action  |
+--------------------------+---------+
| Some text from a movie   | humor   |
+--------------------------+---------+
| Another text here        | romance |
+--------------------------+---------+
| Another text here        | cartoons|
+--------------------------+---------+
| And some text more       | humor   |
+--------------------------+---------+
Next, I train one classifier per tag to tell me whether or not that tag applies to a given text. For example, to figure out whether a text should be tagged "humor", I end up with the following training set:
+--------------------------+---------+
| TEXT                     | TAG     |
+--------------------------+---------+
| Some text from a movie   | humor   |
+--------------------------+---------+
| Another text here        |not humor|
+--------------------------+---------+
| And some text more       | humor   |
+--------------------------+---------+
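To make the relabeling step concrete, here is a minimal sketch of how the (text, tag) rows could be turned into the binary training set for one tag. The `rows` list and the `binary_training_set` helper are hypothetical names used only for illustration; they mirror the tables above and are not part of any library.

```python
from collections import defaultdict

# Hypothetical training rows in the (text, tag) shape of the first table.
rows = [
    ("Some text from a movie", "action"),
    ("Some text from a movie", "humor"),
    ("Another text here", "romance"),
    ("Another text here", "cartoons"),
    ("And some text more", "humor"),
]


def binary_training_set(rows, tag):
    """Return one (text, label) pair per distinct text for a single tag."""
    # Collect every tag attached to each distinct text.
    tags_by_text = defaultdict(set)
    for text, row_tag in rows:
        tags_by_text[text].add(row_tag)
    # Dicts keep insertion order, so texts come out in first-seen order.
    return [
        (text, tag if tag in tags else "not " + tag)
        for text, tags in tags_by_text.items()
    ]


humor_set = binary_training_set(rows, "humor")
```

Calling the same helper once per tag gives one binary training set per classifier.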
Then I train a classifier that learns whether a text is humor or not (the same approach is applied to the rest of the tags). After that I end up with a total of 4 classifiers, one per tag.
Finally, when I get a new text, I run it through each of the 4 classifiers. For each classifier that gives a positive classification (that is, X instead of not-X) with a probability over a certain threshold (say 0.9), I assume that the new text belongs to tag X.
In particular, I am using Naive Bayes as the algorithm, but the same approach works with any algorithm that outputs a probability.
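The whole procedure could be sketched roughly as follows. This is only an illustration under stated assumptions: `BinaryNaiveBayes`, `train_one_vs_rest` and `predict_tags` are hypothetical names, the classifier is a bare-bones multinomial Naive Bayes with Laplace smoothing and whitespace tokenization, and the example texts are made up. It is not the code actually used for the results described here.

```python
import math
from collections import Counter


class BinaryNaiveBayes:
    """Tiny multinomial Naive Bayes for one tag: tag present vs. absent."""

    def fit(self, texts, labels):
        # labels are booleans: True if the text carries the tag.
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def _log_score(self, words, label):
        total_docs = sum(self.doc_counts.values())
        # Laplace-smoothed prior so an empty class never yields log(0).
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        for word in words:
            if word in self.vocab:  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def predict_proba(self, text):
        """Probability that the tag applies to the text."""
        words = text.lower().split()
        pos = self._log_score(words, True)
        neg = self._log_score(words, False)
        # Normalise in log space to avoid underflow on long texts.
        top = max(pos, neg)
        p, n = math.exp(pos - top), math.exp(neg - top)
        return p / (p + n)


def train_one_vs_rest(tagged_texts):
    """tagged_texts: list of (text, set_of_tags). Returns {tag: classifier}."""
    texts = [text for text, _ in tagged_texts]
    all_tags = set().union(*(tags for _, tags in tagged_texts))
    return {
        tag: BinaryNaiveBayes().fit(texts, [tag in tags for _, tags in tagged_texts])
        for tag in sorted(all_tags)
    }


def predict_tags(classifiers, text, threshold=0.9):
    """Keep every tag whose classifier is at least `threshold` confident."""
    return {
        tag for tag, clf in classifiers.items()
        if clf.predict_proba(text) >= threshold
    }


# Made-up training data, one entry per text with all of its tags.
training = [
    ("explosions and car chases with funny jokes", {"action", "humor"}),
    ("a love story told with talking animals", {"romance", "cartoons"}),
    ("funny jokes all night long", {"humor"}),
]
classifiers = train_one_vs_rest(training)
tags = predict_tags(classifiers, "funny jokes", threshold=0.9)
```

With so little data the probabilities are not meaningful; the point is only the shape of the loop: one binary model per tag, one probability per model, one threshold test per probability.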
Now the question is: is this approach correct? Am I doing something terribly wrong here? From the results I get, things seem to make sense, but I would like a second opinion.
Yes, this makes sense. It is a well-known, basic technique for multilabel/multiclass classification called a "one vs all" (or "one vs rest") classifier. It is very old and widely used. On the other hand, it is also quite naive, since it does not consider any relations between your classes/tags. You might be interested in reading about structured learning, which covers problems where there is some structure over the label space that can be exploited (and usually there is).
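As a rough illustration of the kind of relation that independent per-tag classifiers never look at, the sketch below counts how often two tags appear on the same text. The `rows` data and the `tag_cooccurrence` helper are hypothetical and only mirror the shape of the training set in the question; this is a way to inspect label structure, not a structured learning method in itself.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (text, tag) rows in the same shape as the question's table.
rows = [
    ("Some text from a movie", "action"),
    ("Some text from a movie", "humor"),
    ("Another text here", "romance"),
    ("Another text here", "cartoons"),
    ("And some text more", "humor"),
]


def tag_cooccurrence(rows):
    """Count how many texts carry each unordered pair of tags."""
    tags_by_text = defaultdict(set)
    for text, tag in rows:
        tags_by_text[text].add(tag)
    pairs = Counter()
    for tags in tags_by_text.values():
        # Sort so each pair is always stored in the same order.
        pairs.update(combinations(sorted(tags), 2))
    return pairs


pair_counts = tag_cooccurrence(rows)
```

On a real data set, pairs that co-occur very often (or never) are exactly the information a one vs all setup throws away.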