
Removing all stopwords except "you", "yours", "me", "mine"

I am trying to remove all English stopwords except "you"/"yours" and "me"/"mine", because those are important to consider for my analysis. Can someone please help me with this? I am very new to R; I know that I can remove stopwords with the following code:

corpus <- tm_map(corpus, removeWords, stopwords("english"))

... but I have no clue how to keep the words I need.


  • You can extract the strings from stopwords("english") and drop the ones you wish to keep from that vector, so that removeWords no longer strips them from your text. Here is an example using dplyr grammar.

    library(tm)    # stopwords(), tm_map(), removeWords
    library(dplyr) # %>%, filter(), pull()

    words_to_keep <- c("you", "your", "yours", "me", "mine")
    my_stopwords <- data.frame(words = stopwords("english")) %>% # make into a data frame
      filter(!(words %in% words_to_keep)) %>% # drop the words you want to keep
      pull(words) # transform it back into a vector of strings
    corpus <- tm_map(corpus, removeWords, my_stopwords)
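If you would rather not load dplyr just for this, the same filtering can probably be done in base R with setdiff(). This is only a minimal sketch: the one-sentence corpus is made up for illustration, and the list of words to keep is my reading of the question rather than anything fixed.

```r
library(tm)

# Words that should survive stopword removal (assumed from the question)
words_to_keep <- c("you", "your", "yours", "me", "mine")

# setdiff() returns the elements of the first vector that are not in the second
my_stopwords <- setdiff(stopwords("english"), words_to_keep)

# Hypothetical one-document corpus, only here so the example runs on its own
corpus <- VCorpus(VectorSource("this is your book and that one is mine"))
corpus <- tm_map(corpus, removeWords, my_stopwords)

# Inspect the cleaned text of the first document
content(corpus[[1]])
```

removeWords leaves extra spaces where words were dropped, so you may want to follow it with tm_map(corpus, stripWhitespace).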