R: Keep Upper Case with TermDocumentMatrix


I want to create a word cloud with the wordcloud package. I want to keep the uppercase letter at the beginning of each word, but all letters are automatically converted to lowercase.

As far as I can see, this happens when I use the TermDocumentMatrix function. Is there a way to prevent the function from converting all letters to lowercase?


Solution

  • You can prevent TermDocumentMatrix from converting everything to lowercase by specifying tolower=FALSE in the control list. Since you did not provide any data, I will illustrate with the crude sample corpus provided in the tm package.

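    library(wordcloud)
    library(tm)
    data(crude)  # sample corpus of Reuters articles shipped with tm

    # tolower = FALSE keeps the original capitalization of each term
    tdm <- TermDocumentMatrix(crude,
        control = list(removePunctuation = TRUE, tolower = FALSE, stopwords = TRUE))
    # Total frequency of each term across all documents
    WordFreq <- slam::row_sums(tdm)
    # Keep the 20 most frequent terms and plot them
    FrequentWords <- tail(sort(WordFreq), 20)
    wordcloud(names(FrequentWords), FrequentWords)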
    

    [Image: the resulting word cloud]
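
To see the effect of tolower in isolation, here is a minimal sketch that builds the matrix twice on a tiny corpus and compares the resulting terms. The two sentences and the variable names (docs, terms_lower, terms_keep) are made up for illustration and are not part of the original answer.

```r
library(tm)

# Hypothetical two-document corpus, invented for this example
docs <- VCorpus(VectorSource(c("Oil prices rose in Kuwait",
                               "Kuwait cut oil output")))

# Same corpus, with and without lowercasing
tdm_lower <- TermDocumentMatrix(docs, control = list(tolower = TRUE))
tdm_keep  <- TermDocumentMatrix(docs, control = list(tolower = FALSE))

terms_lower <- Terms(tdm_lower)  # every term is lowercased
terms_keep  <- Terms(tdm_keep)   # capitalization is preserved

print(terms_lower)
print(terms_keep)
```

One consequence worth keeping in mind: with tolower=FALSE, case variants such as "Oil" and "oil" are counted as separate terms, so a word that appears both at the start and in the middle of sentences may show up twice in the word cloud with split frequencies.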