ibm-watsonalchemyapi

Watson/Alchemy sentiment analysis mislabeled as negative in some cases


I'm using the Watson/Alchemy Sentiment Analysis API, and have found some articles that are being labeled negative, when the articles are arguably positive. This happens when the articles are discussing good or beneficial decreases.

For example, take this Washington Post article, "We’ve had a massive decline in gun violence in the United States. Here’s why." When I submit it to the API, it returns a score of -0.4, even though the article is quite optimistic! (The article argues that gun violence has fallen significantly.)

Another example is this article from CoreLogic, "CoreLogic Reports 38,000 Completed Foreclosures in January 2016." The API returns a document sentiment score of -0.27, even though the text is positive: "...the foreclosure inventory declined by 21.7 percent and completed foreclosures declined by 16.2 percent compared with January 2015. The number of completed foreclosures nationwide decreased year over year from 46,000 in January 2015 to 38,000 in January 2016."

Is there an established workaround for this issue? We don't want to damage the credibility of the service, and thus of our results, when a careful reader would assess the sentiment of articles like these quite differently than the API does. I'm looking for something that would let me modify sentiment results for specific cases (e.g. "decrease in foreclosures" is positive, as is "decrease in homicides").


Solution

  • I believe this is quite normal :-) It is very rare for a sentiment analysis algorithm to give the right answer 100% of the time. I don't know how Watson implements it, but my bet is that the document score is built up from the sentiment of individual words and expressions. For example, "gun" and "violence" most likely carry negative sentiment, and Watson may have failed to recognize that they were the object of "massive decline" (even "decline" on its own may carry negative sentiment).

    Even state-of-the-art sentiment analysis algorithms reach only about 85-90% accuracy, and that is within very specific domains. So it's important to set your expectations accordingly.
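
To make that guess concrete, here is a minimal Python sketch. It is not how Watson works internally (I don't know that); it only shows how a naive word-level scorer would penalize "decline in gun violence", and one crude way to post-process a score you got back from the API. The word lists, the `window` size, and the function names are all my own assumptions for illustration, and simply flipping the sign is a rough heuristic, not an established workaround.

```python
import re

# Hypothetical toy word lists, chosen only for this illustration.
NEGATIVE_TERMS = {"gun", "violence", "foreclosure", "foreclosures", "homicides"}
DECREASE_TERMS = {"decline", "declined", "decrease", "decreased",
                  "fell", "fallen", "drop", "dropped"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (numbers are dropped)."""
    return re.findall(r"[a-z']+", text.lower())


def naive_score(text):
    """Bag-of-words score: every negative or decrease word counts as -1,
    normalized by the number of tokens. Word order is ignored entirely."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in NEGATIVE_TERMS or t in DECREASE_TERMS)
    return -hits / len(tokens)


def has_beneficial_decrease(text, window=4):
    """True if a decrease word appears within `window` tokens of a negative term."""
    tokens = tokenize(text)
    decreases = [i for i, t in enumerate(tokens) if t in DECREASE_TERMS]
    negatives = [i for i, t in enumerate(tokens) if t in NEGATIVE_TERMS]
    return any(abs(i - j) <= window for i in decreases for j in negatives)


def adjust_score(api_score, text):
    """Flip a negative score to positive when a beneficial decrease is detected;
    otherwise return the score unchanged."""
    if api_score < 0 and has_beneficial_decrease(text):
        return abs(api_score)
    return api_score


headline = "We've had a massive decline in gun violence in the United States."
print(naive_score(headline))         # negative: the toy scorer is fooled too
print(adjust_score(-0.4, headline))  # -0.4 is the score reported in the question
```

In practice you would maintain the list of "bad things whose decrease is good" for your own domain (foreclosures, homicides, and so on) and review the flipped cases by hand, since a proximity rule like this one will also fire on sentences where the decrease is not actually good news.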