Tokenize Text and Analyze with Dictionary in Quanteda


I am trying to do a text analysis using the quanteda package in R and have successfully produced the desired output from my unprocessed texts. Now I would like to remove stopwords and other common phrases and rerun the analysis (from what I am learning in other sources, this process is called "tokenizing"(?)). (The instructions I followed are from https://data.library.virginia.edu/a-beginners-guide-to-text-analysis-with-quanteda/)

I was able to process the text using those instructions and the quanteda package. Now I would like to apply a dictionary to analyze the processed text. How can I do that? Since it is hard to attach all my documents here, any hints or examples that I can apply would be helpful and greatly appreciated.

Thank you!


Solution

  • I have used the tidytext library with great success: load a sentiment lexicon, then merge it with your data by word to get the score or sentiment.

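    library(tidytext)

    # Each call returns a data frame of words with a score ("afinn") or a label ("bing")
    get_sentiments("afinn")
    get_sentiments("bing")

    You can save it as a table:

    afinn <- get_sentiments("afinn")

    # data_frame_a and data_frame_b are placeholders for your own data frames;
    # to merge by word, use the name of the shared word column in `by`
    total <- merge(data_frame_a, data_frame_b, by = "ID")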
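
To make the merge-by-word step concrete, here is a minimal sketch in base R. It assumes a tiny hand-made lexicon in place of the real table returned by `get_sentiments("afinn")`, and the names `lexicon`, `docs`, `stop_words_small`, and `tokens_df` are illustrative only. It splits each text into words, drops a few stopwords, merges with the lexicon, and sums the values per document.

```r
# Toy lexicon standing in for a real sentiment table; words and values are made up
lexicon <- data.frame(
  word  = c("good", "great", "bad", "terrible"),
  value = c(3, 3, -3, -3),
  stringsAsFactors = FALSE
)

# Hypothetical documents, one row per text
docs <- data.frame(
  ID   = c(1, 2),
  text = c("The food was good and the service was great",
           "A terrible room and a bad view"),
  stringsAsFactors = FALSE
)

# Illustrative stopword list (a real analysis would use a fuller one)
stop_words_small <- c("the", "a", "and", "was")

# Split each text into lowercase words: one row per (ID, word) pair
word_list <- strsplit(tolower(docs$text), "[^a-z]+")
tokens_df <- data.frame(
  ID   = rep(docs$ID, lengths(word_list)),
  word = unlist(word_list),
  stringsAsFactors = FALSE
)

# Drop stopwords and any empty strings left over from splitting
keep <- !(tokens_df$word %in% stop_words_small) & nzchar(tokens_df$word)
tokens_df <- tokens_df[keep, ]

# Merge by word: only words found in the lexicon keep a row (inner join)
scored <- merge(tokens_df, lexicon, by = "word")

# Sum the word values per document to get one score per ID
doc_scores <- aggregate(value ~ ID, data = scored, FUN = sum)
print(doc_scores)
```

If you would rather stay inside quanteda, the package has its own route to the same idea: `tokens()` and `tokens_remove()` with `stopwords("en")` for the cleaning step, then `dictionary()` together with `tokens_lookup()` or `dfm_lookup()` to count dictionary matches per document.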