
sentimentr - different results for different text partitioning


I am using sentimentr to analyse the following text:

I haven’t been sad in a long time. I am extremely happy today. It’s a good day.

I first partitioned the text sentence by sentence:

library(sentimentr)

ase1 <- c(
  "I haven't been sad in a long time.",
  "I am extremely happy today.",
  "It's a good day."
)

part1 <- get_sentences(ase1)
sentiment(part1)

   element_id sentence_id word_count sentiment
1:          1           1          8 0.1767767
2:          2           1          5 0.6037384
3:          3           1          4 0.3750000

Then I passed the same text as a single block:

ase2 <- c(
  "I haven’t been sad in a long time. I am extremely happy today. It’s a good day.")

part2 <- get_sentences(ase2)
sentiment(part2)

   element_id sentence_id word_count   sentiment
1:          1           1          9 -0.03333333
2:          1           2          5  0.60373835
3:          1           3          5  0.33541020

The text is the same, yet both the word counts and the sentiment scores differ.

Why does this happen?


Solution

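  • The two texts are not quite the same. The first example uses the straight ASCII apostrophe ' (U+0027), but the second uses the typographic apostrophe ’ (U+2019). These are different characters, and text-mining tools treat them differently. With ’, the contractions are tokenized differently (note the word counts of 9 and 5 instead of 8 and 4), and the negation in "haven’t" is apparently no longer recognized, so the first sentence scores negative.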

    The example below returns the same results as in your first example.

    ase2 <- c(
      "I haven't been sad in a long time. I am extremely happy today. It's a good day.")
    
    part2 <- get_sentences(ase2)
    sentiment(part2)
       element_id sentence_id word_count sentiment
    1:          1           1          8 0.1767767
    2:          1           2          5 0.6037384
    3:          1           3          4 0.3750000
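
If the input comes from a source that produces typographic quotes (a word processor or a web page, for example), one option is to normalize them before scoring. The following is a minimal sketch, not something sentimentr provides: `normalize_quotes` is a hypothetical helper written for this example, and it only covers the common curly single and double quotes. The sentimentr call is skipped if the package is not installed.

```r
# Hypothetical helper (not part of sentimentr): map typographic quotes
# to their plain ASCII equivalents before sentiment scoring.
normalize_quotes <- function(x) {
  x <- gsub("[\u2018\u2019]", "'", x)   # curly single quotes and apostrophes
  x <- gsub("[\u201C\u201D]", "\"", x)  # curly double quotes
  x
}

# The block of text from the question, with typographic apostrophes (U+2019).
ase2 <- "I haven\u2019t been sad in a long time. I am extremely happy today. It\u2019s a good day."

ase2_clean <- normalize_quotes(ase2)

# Score the cleaned text only when sentimentr is available.
if (requireNamespace("sentimentr", quietly = TRUE)) {
  print(sentimentr::sentiment(sentimentr::get_sentences(ase2_clean)))
}
```

After normalization, `ase2_clean` is the same string as the straight-apostrophe version used above, so it should be scored the same way.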