r dataframe text-mining tm word-cloud

Wordcloud in R from list of values (not from text documents)


I have ranked the tokens in my texts according to a criterion, and each token has a value. My list looks like this:

value,token
3,tok1
2.84123,tok2
1.5,tok3
1.5,tok4
1.01,tok5
0.9,tok6
0.9,tok7
0.9,tok8
0.81,tok9
0.73,tok10
0.72,tok11
0.65,tok12
0.65,tok13
0.6451231,tok14
0.6,tok15
0.5,tok16
0.4,tok17
0.3001,tok18
0.3,tok19
0.2,tok20
0.2,tok21
0.1,tok22
0.05,tok23
0.04123,tok24
0.03,tok25
0.02,tok26
0.01,tok27
0.01,tok28
0.01,tok29
0.007,tok30

I then try to produce a word cloud with the following code:

library(tm)
library(wordcloud)

tokList = read.table("tokens.txt", header = TRUE, sep = ',') 

# Create corpus
corp <- Corpus(DataframeSource(tokList))
corpPTD <- tm_map(corp, PlainTextDocument)

wordcloud(corpPTD, max.words = 50, random.order=FALSE)

Which produces:

[image: word cloud produced by the code above]

But that is not what I want. I would like a word cloud in which the tokens ("tok1", "tok2", ...) are sized according to the values in the table. For example, a token with a value of 3 should be drawn three times bigger than a token with a value of 1.

Can somebody maybe help?


Solution

  • This will work, assuming your minimum value is not zero (if it is zero, filter out the corresponding tokens first):

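    library(wordcloud)
    library(RColorBrewer)
    # Dividing by the minimum makes the smallest frequency 1, so min.freq = 1 keeps every token
    wordcloud(tokList$token, tokList$value/min(tokList$value), max.words = 50, min.freq = 1,
              random.order=FALSE, colors=brewer.pal(6,"Dark2"), random.color=TRUE)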
    

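The filtering step mentioned in the answer (dropping zero values before dividing by the minimum) can be sketched as follows. This is a minimal illustration, not the answerer's exact code: the inline data frame copies a few rows from the question, the `tokZero` row and the `freq` column are hypothetical names added only to show the filter, and the plot is drawn only if the wordcloud package happens to be installed.

```r
# A few rows from the question's table, plus one hypothetical zero-valued
# row ("tokZero") that exists only to demonstrate the filtering step.
tokList <- data.frame(
  value = c(3, 2.84123, 1.5, 0.5, 0.007, 0),
  token = c("tok1", "tok2", "tok3", "tok16", "tok30", "tokZero"),
  stringsAsFactors = FALSE
)

# Drop tokens whose value is not strictly positive, so that dividing
# by the minimum is well defined.
tokList <- tokList[tokList$value > 0, ]

# Rescale so the smallest remaining value becomes 1; the ratios between
# tokens are unchanged. "freq" is a column name chosen for this sketch.
tokList$freq <- tokList$value / min(tokList$value)

print(tokList)

# Draw the cloud only when the wordcloud package is available.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(tokList$token, tokList$freq, min.freq = 1,
                       random.order = FALSE)
}
```

The rescaling matters mainly because `wordcloud` drops words whose frequency is below `min.freq`, and the raw values here are mostly below 1. As far as I can tell, the function maps frequencies linearly onto the font-size range given by its `scale` argument, so the drawn sizes follow the order of the values but are not guaranteed to be exactly proportional to them.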