Tags: r, tag-cloud

Preserve uppercase in tagcloud


I want to make a tag cloud to visualize the gene frequency.

library(wordcloud)

genes_snv <- read.csv("genes.txt", sep="", header=FALSE)

wordcloud(genes_snv$V1,
          min.freq=15,
          scale=c(5,0.5),
          max.words=100,
          random.order=FALSE,
          rot.per=0.3,
          colors=brewer.pal(8, "Dark2"))

This is my code, but it converts everything to lowercase (not useful with gene names). How can I avoid this?

genes.txt starts with

Fcrl5
Etv3
Etv3
Lrrc71
Lrrc71
(...)

Solution

  • When the freq argument is missing, wordcloud calls tm::TermDocumentMatrix, which I guess calls tolower internally before computing the frequencies.

    To avoid the call to tm, we can supply our own frequencies, as in this example:

    # dummy data
    set.seed(1)
    genes <- c("Fcrl5","Etv3","Etv3","Lrrc71","Lrrc71")
    genes <- unlist(sapply(genes, function(i)rep(i, sample(1:100,1))))
    
    # get frequency
    plotDat <- as.data.frame(table(genes))
    
    # plot
    wordcloud(words = plotDat$genes, freq = plotDat$Freq,
              min.freq=15,
              scale=c(5,0.5),
              max.words=100,
              random.order=FALSE,
              rot.per=0.3,
              colors=brewer.pal(8, "Dark2"))
    

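Applied to the file from the question, the same idea might look like the sketch below. It assumes genes.txt holds one gene symbol per line; here a temporary file with the five sample lines stands in for it, and the names path and gene_freq are made up for illustration. The plot is only drawn if the wordcloud package is installed, and min.freq is lowered to 1 because the sample is tiny.

```r
# Stand-in for the question's genes.txt (one gene symbol per line);
# the temporary file and its contents are only for illustration.
path <- tempfile(fileext = ".txt")
writeLines(c("Fcrl5", "Etv3", "Etv3", "Lrrc71", "Lrrc71"), path)

# Read the symbols as plain character strings rather than factors.
genes_snv <- read.csv(path, sep = "", header = FALSE, stringsAsFactors = FALSE)

# Count the occurrences ourselves; table() is case-sensitive,
# so the gene names keep their original capitalisation.
gene_freq <- as.data.frame(table(gene = genes_snv$V1), stringsAsFactors = FALSE)
print(gene_freq)

# Passing freq explicitly means wordcloud does not build the
# frequencies itself. min.freq = 1 is an assumption for this small sample.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(words = gene_freq$gene, freq = gene_freq$Freq,
                       min.freq = 1,
                       scale = c(5, 0.5),
                       max.words = 100,
                       random.order = FALSE,
                       rot.per = 0.3,
                       colors = RColorBrewer::brewer.pal(8, "Dark2"))
}
```

With the real data, replace path with "genes.txt" and restore min.freq=15 from the question.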