ibm-cloud ibm-watson nl-classifier

Watson Natural Language Classifier - using a scale for classes


When using Watson's NLC API, can a scale be used for the classes? For example, a 1-5 rating or a Yes/No/Maybe classification?

My training data consists of a list of news headlines. Each one is labeled with one of three classes: not interesting, somewhat interesting, or very interesting. I want to predict whether a headline would be interesting to the reader based on what they found interesting in the past. Because this feels more like a regression model that predicts a number between 1 and 3, I wonder if the classifier would work correctly for this application. Thoughts?


Solution

  • Yes, you can use a 1-5 rating by defining five classes. Whether it will work well is hard to tell in advance, because it depends on your data :-)

    But it's a completely valid approach.

    Behind the scenes, NLC extracts the meaning of each text sample, computes a semantic distance using an internal Wikipedia-based ontology, and tries to build a classifier from the concepts found in each sample.

    So using 5 categories will work if your text examples have intrinsic semantic differences between the clusters, so that the classifier can group what is related and separate what is different.

    The same logic was used in this recipe, which relies on the Watson Image classifier instead of NLC.
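
To make the "one class per point on the scale" idea concrete, here is a minimal sketch in plain Python. It assumes the two-column CSV layout for training data (text first, class second) and makes no call to the NLC service. The label names, the `SCALE` mapping, and both helper functions are hypothetical, and the confidences at the end are made up to stand in for a classifier response. The second helper shows one possible way to get back a regression-like number: weight each class's scale value by its confidence.

```python
import csv
import io

# Hypothetical labels: one class per point on the 1-3 scale.
SCALE = {"not_interesting": 1, "somewhat_interesting": 2, "very_interesting": 3}


def build_training_csv(rows):
    """Serialize (headline, label) pairs as two-column CSV: text, then class."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for text, label in rows:
        if label not in SCALE:
            raise ValueError(f"unknown label: {label!r}")
        writer.writerow([text, label])  # csv quotes headlines containing commas
    return buf.getvalue()


def expected_rating(confidences):
    """Collapse per-class confidences into one number on the 1-3 scale."""
    total = sum(confidences.values())
    if total <= 0:
        raise ValueError("confidences must sum to a positive value")
    return sum(SCALE[label] * c for label, c in confidences.items()) / total


rows = [
    ("Local team wins championship, fans celebrate", "very_interesting"),
    ("City council schedules routine budget meeting", "not_interesting"),
]
training_csv = build_training_csv(rows)

# Made-up confidences standing in for a classifier response.
score = expected_rating(
    {"not_interesting": 0.1, "somewhat_interesting": 0.3, "very_interesting": 0.6}
)
```

With these made-up confidences the weighted score comes out at about 2.5, between "somewhat" and "very" interesting. Note that the classifier itself treats the three labels as unordered categories; the ordering only comes back in through the `SCALE` mapping applied afterwards.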