I am trying to do some sentiment analysis on r/wallstreetbets content and would also like to use the meaning of emojis.
Here is my code:
from nltk.sentiment.vader import SentimentIntensityAnalyzer
wsb_lingo = {
"bullish": 4.0,
"bearish": -4.0,
"bagholder": -4.0,
"BTFD": 4.0,
"FD": 4.0,
"diamond hands": 0.0,
"paper hands": 0.0,
"DD": 4.0,
"GUH": -4.0,
"pump": 4.0,
"dump": -4.0,
"gem stone": 4.0, # emoji
"rocket": 4.0, # emoji
"andromeda": 0.0,
"to the moon": 4.0,
"stonks": -4.0,
"tendies": 4.0,
"buy": 4.0,
"sell": -4.0,
"hold": 4.0,
"short": 4.0,
"long": 4.0,
"overvalued": -4.0,
"undervalued": 4.0,
"calls": 4.0,
"call": 4.0,
"puts": -4.0,
"put": -4.0,
}
sid = SentimentIntensityAnalyzer()
sid.lexicon.update(wsb_lingo)
# Test
print(sid.polarity_scores('🚀'))
print(sid.polarity_scores('😄'))
The output is given below:
{'neg': 0.0, 'neu': 0.0, 'pos': 0.0, 'compound': 0.0}
{'neg': 0.0, 'neu': 0.0, 'pos': 0.0, 'compound': 0.0}
Why does it return no sentiment at all for emojis? Could this be caused by Jupyter Notebook, or am I forgetting something? All libraries are up to date.
If I use vaderSentiment instead of nltk.sentiment.vader, it works for me:
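from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

new_words = {"rocket": 4.0}
sia = SentimentIntensityAnalyzer()

# Before the update: the emoji is converted to the word "rocket", which has no score yet
print(sia.polarity_scores('🚀'))
# Outputs: {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}

# After adding "rocket" to the lexicon, the emoji scores as positive
sia.lexicon.update(new_words)
print(sia.polarity_scores('🚀'))
# Outputs: {'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.7184}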
See also this issue
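The likely reason for the difference is that vaderSentiment converts emojis to text descriptions before the lexicon lookup, while the NLTK port does not appear to include that step. If you need to stay with nltk.sentiment.vader, one workaround is to do the conversion yourself before scoring. The sketch below is only an approximation: emojis_to_text is a hypothetical helper (not part of either library), and it uses Unicode character names from the standard library as a stand-in for vaderSentiment's own emoji descriptions, which may differ.

```python
import unicodedata


def emojis_to_text(text: str) -> str:
    """Replace each emoji-like symbol with its lowercased Unicode name.

    Hypothetical helper, not part of NLTK or vaderSentiment.
    """
    parts = []
    for ch in text:
        # Category "So" (Symbol, other) covers most emoji. This is an
        # approximation, not a complete emoji detector.
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "")
            # Pad with spaces so adjacent emojis become separate words.
            parts.append(f" {name.lower()} " if name else ch)
        else:
            parts.append(ch)
    # Collapse the extra whitespace introduced by the padding.
    return " ".join("".join(parts).split())


print(emojis_to_text("GME 🚀🚀"))  # GME rocket rocket
```

You would then pass the converted string to the analyzer, e.g. sid.polarity_scores(emojis_to_text('🚀')), so that the "rocket" entry in your custom lexicon can match. Note that an emoji whose name has several words (such as "gem stone") becomes several tokens, and as far as I can tell VADER looks up one token at a time, so a multi-word lexicon key would probably still not match.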