
Sentiment Analysis in Spanish with Stanford CoreNLP


I'm new here and wanted to know if anyone can help me with the following question.

I'm doing sentiment analysis of Spanish text with Stanford CoreNLP, but I cannot get a positive result.

That is, if I analyze any English text, it is analyzed perfectly, but when I give it Spanish text the result is always negative.

I've been looking into how to configure the parser and the tokenizer for Spanish, but nothing I found helped with sentiment analysis.

Can someone tell me whether tokenization is the only thing that works in Spanish, and sentiment does not?

This is the properties file I have managed to put together so far:

annotators = tokenize, ssplit, pos, ner, parse, sentiment
tokenize.language = en
pos.model = edu/stanford/nlp/models/pos-tagger/english/spanish-distsim.tagger
ner.model = edu/stanford/nlp/models/ner/spanish.ancora.distsim.s512.crf.ser.gz
ner.applyNumericClassifiers = false
ner.useSUTime = false
parse.model = edu/stanford/nlp/models/lexparser/spanishPCFG.ser.gz
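As a quick sanity check on a file like this, the following sketch loads the settings with `java.util.Properties` and lists which annotators in the `annotators` line have no explicit `<name>.model` entry. It uses only the standard library. The class name `ModelOverrideCheck`, the helper names, and the list of model-backed annotators are assumptions made for illustration, and the model values are placeholders because only the presence of each key is checked.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

class ModelOverrideCheck {
    // Assumption: of the annotators in the question, these four load a trained
    // model and can be pointed at another one through "<name>.model".
    static final List<String> MODEL_BACKED = Arrays.asList("pos", "ner", "parse", "sentiment");

    /** Parses properties-file text held in a string. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the model-backed annotators that have no "<name>.model" entry. */
    static List<String> annotatorsWithoutModel(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String raw : props.getProperty("annotators", "").split(",")) {
            String name = raw.trim();
            if (MODEL_BACKED.contains(name) && props.getProperty(name + ".model") == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Keys mirror the question's file; the values are placeholders.
        String config = String.join("\n",
            "annotators = tokenize, ssplit, pos, ner, parse, sentiment",
            "pos.model = spanish-pos-model-placeholder",
            "ner.model = spanish-ner-model-placeholder",
            "parse.model = spanish-parse-model-placeholder");
        // Lists the annotators that would fall back to their default model.
        System.out.println(annotatorsWithoutModel(parse(config)));
    }
}
```

With the keys from the file above, the only annotator reported is `sentiment`: the tagger, NER, and parser are pointed at Spanish models, while sentiment is left on whatever model it loads by default.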

The code that performs the sentiment analysis itself is the typical example you can find in any tutorial.

Thank you very much!!


Solution

  • Unfortunately there is no Stanford sentiment model available for Spanish. At the moment all the Spanish words are likely being treated as generic "unknown words" by the sentiment analysis algorithm, which is why you're seeing consistently bad performance.

    You can certainly train your own model (this is documented elsewhere on the Internet, I believe), but you'll need Spanish training data to do so.
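To illustrate why unknown words lead to the same result for every input, here is a deliberately simplified toy sketch. It is not the Stanford sentiment algorithm (which is a neural model over parse trees); it only averages per-word scores from a tiny hypothetical English lexicon and maps every other word to a single unknown-word entry. The class `UnknownWordDemo`, the lexicon, and the score values, including the slightly negative score for unknown words, are all made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

class UnknownWordDemo {
    // Placeholder token that stands in for any out-of-vocabulary word.
    static final String UNK = "*UNK*";

    // Hypothetical English-only lexicon; the values are invented.
    static final Map<String, Double> SCORES = new HashMap<>();
    static {
        SCORES.put("great", 1.0);
        SCORES.put("terrible", -1.0);
        SCORES.put("movie", 0.0);
        SCORES.put(UNK, -0.2); // arbitrary choice: unknown words lean negative
    }

    /** Averages word scores; every word outside the lexicon gets the UNK score. */
    static double score(String sentence) {
        String[] tokens = sentence.trim().toLowerCase().split("\\s+");
        double sum = 0.0;
        for (String token : tokens) {
            sum += SCORES.getOrDefault(token, SCORES.get(UNK));
        }
        return sum / tokens.length;
    }

    /** Maps the averaged score to a coarse label. */
    static String label(String sentence) {
        double s = score(sentence);
        if (s > 0) return "positive";
        if (s < 0) return "negative";
        return "neutral";
    }

    public static void main(String[] args) {
        System.out.println(label("great movie"));            // words are in the lexicon
        System.out.println(label("terrible movie"));         // words are in the lexicon
        System.out.println(label("me encanta este libro"));  // every word becomes UNK
        System.out.println(label("odio este libro"));        // every word becomes UNK
    }
}
```

In this toy setup the two English sentences get different labels, while the two Spanish sentences get the same label even though they express opposite opinions, because the model sees nothing but unknown-word tokens in both.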