Sentiment Analysis in R using TDM/DTM


I am trying to run a sentiment analysis in R on my DTM (document-term matrix) or TDM (term-document matrix). I could not find a similar topic on this forum or on Google. So far I have created a corpus and generated a DTM/TDM from it. My next step is the sentiment analysis, whose output I will later need for stock prediction with an SVM. My current code is:

    dtm <- DocumentTermMatrix(docs)
    dtm <- removeSparseTerms(dtm, 0.99)
    dtm <- as.data.frame(as.matrix(dtm))

    tdm <- TermDocumentMatrix(docs)
    tdm <- removeSparseTerms(tdm, 0.99)
    tdm <- as.data.frame(as.matrix(tdm))

I read that this is possible with the tidytext package and its get_sentiments() function, but I could not get it to work with a DTM/TDM. How can I run a sentiment analysis on my cleaned, filtered words, which are already stemmed, tokenized, etc.? I have seen many people run sentiment analysis on whole sentences, but I would like to apply it to my individual words to see whether each one is positive or negative, what its score is, and so on. Many thanks in advance!


Solution

  • The SentimentAnalysis package integrates well with tm: analyzeSentiment() accepts a DocumentTermMatrix directly.

    library(tm)
    library(SentimentAnalysis)
    
    documents <- c("Wow, I really like the new light sabers!",
                   "That book was excellent.",
                   "R is a fantastic language.",
                   "The service in this restaurant was miserable.",
                   "This is neither positive or negative.",
                   "The waiter forget about my dessert -- what poor service!")
    
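    # Build a corpus and a document-term matrix from the raw strings
    vc <- VCorpus(VectorSource(documents))
    dtm <- DocumentTermMatrix(vc)
    
    # Score each document against two dictionaries:
    # Loughran-McDonald (finance-specific) and QDAP (general purpose)
    analyzeSentiment(dtm, 
      rules=list(
        "SentimentLM"=list(
          ruleSentiment, loadDictionaryLM()
        ),
        "SentimentQDAP"=list(
          ruleSentiment, loadDictionaryQDAP()
        )
      )
    )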
    #   SentimentLM SentimentQDAP
    # 1       0.000     0.1428571
    # 2       0.000     0.0000000
    # 3       0.000     0.0000000
    # 4       0.000     0.0000000
    # 5       0.000     0.0000000
    # 6      -0.125    -0.2500000
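
Since the question asks about scoring individual words and mentions tidytext, here is a minimal sketch of a per-term lookup, as an alternative to the document-level scores above. It assumes the tidytext and dplyr packages are installed. The two example sentences, the removePunctuation setting, and the variable names term_counts, term_sentiment and doc_tally are illustrative choices, not part of the original answer. tidy() reshapes the DTM into one row per document/term pair, which can then be joined against the Bing lexicon returned by get_sentiments("bing").

```r
library(tm)
library(tidytext)
library(dplyr)

# Illustrative input (assumption): two short documents
documents <- c("That book was excellent.",
               "The service in this restaurant was miserable.")

vc  <- VCorpus(VectorSource(documents))
# Strip punctuation so that terms such as "excellent." become "excellent"
dtm <- DocumentTermMatrix(vc, control = list(removePunctuation = TRUE))

# One row per (document, term, count) instead of a sparse matrix
term_counts <- tidy(dtm)

# Keep only the terms found in the Bing lexicon, with a
# "positive" / "negative" label attached to each
term_sentiment <- inner_join(term_counts, get_sentiments("bing"),
                             by = c("term" = "word"))

# Optional: tally positive and negative term occurrences per document
doc_tally <- count(term_sentiment, document, sentiment, wt = count)

print(term_sentiment)
print(doc_tally)
```

The lexicon contains unstemmed words, so terms that were stemmed before the DTM was built may fail to match. Running the lookup before stemming is one way around that.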