I ran a sentiment analysis using VADER and now want to classify the compound scores as negative, positive, or neutral:
Positive when the compound score is > 0.05
Negative when it is < -0.05
Neutral when it is between -0.05 and 0.05
df_polarity$VADER_Sent = ifelse(df_polarity$VADER_Sent > 0.05, "pos",
  ifelse(df_polarity$VADER_Sent < -0.05, "neg",
    ifelse(between(df_polarity$VADER_Sent, -0.05, 0.05), "neu", "NA")
  )
)
When I run this code, even values such as -0.4XXX are classified as neutral instead of negative.
For some reason this doesn't work. There is something I am missing, but I can't figure out what it is.
I couldn't find any helpful tips by googling.
I hope someone can help me with this one!
Output from str(df_polarity):
$ VADER_Sent : chr "0.0" "-0.4939" "0.7717" "0.7096"
After further looking into my data, it seems that the "-" sign is not recognized in the context of a negative number.
Thanks to everyone who tried to help me! I really appreciate it!
The problem is that the VADER_Sent column is character, so the comparisons < and > are evaluated alphabetically (as strings) instead of numerically.
Example:
> -0.4939 < -0.05
[1] TRUE
> "-0.4939" < "-0.05"
[1] FALSE
Try using as.numeric(df_polarity$VADER_Sent) in your ifelse() statements to get around this.
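As a minimal sketch of that fix, the snippet below rebuilds a small data frame from the values shown in the str() output, converts the column to numeric once, and then classifies it. The sample data frame and the VADER_Label column name are assumptions made for illustration; writing to a separate column simply keeps the original scores intact. The explicit between() check is dropped here because, for non-missing values, anything that is neither above 0.05 nor below -0.05 is already in the neutral band.

```r
# Sample data mirroring the str() output in the question (scores stored as character)
df_polarity <- data.frame(
  VADER_Sent = c("0.0", "-0.4939", "0.7717", "0.7096"),
  stringsAsFactors = FALSE
)

# Convert once so every comparison below is numeric, not alphabetical
score <- as.numeric(df_polarity$VADER_Sent)

# VADER_Label is a hypothetical column name; the original scores are left untouched
df_polarity$VADER_Label <- ifelse(score > 0.05, "pos",
                           ifelse(score < -0.05, "neg", "neu"))

print(df_polarity)
```

Values that as.numeric() cannot parse become NA, and ifelse() passes that NA through, so such rows are easy to spot afterwards with is.na(df_polarity$VADER_Label).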