
R wordcloud2 putting the most frequent words on the edges instead of in the middle


I would like to use the wordcloud2 function in R on my dataset. The demo works nicely (demo image from the library docs).

But with my dataset, the small text ends up in the middle. Any ideas or suggestions (including other R libraries) are welcome :)

my dataset

Thank you very much, Vladimir Vinarsky, The Curious Mechanobiologist

I expected my highest-frequency words in the middle, not at the edges. I tried to:

  1. manipulate the frequency distribution by squaring or cubing it
  2. use different shapes

Solution

  • Words at the top of your data frame are plotted centrally; those at the bottom are plotted peripherally, so if you want the big words in the middle, sort your data frame accordingly.

    For example, let's generate a data frame of computing terms with a random frequency column:

    my_data <- data.frame(word = c("Algorithm", "Function", "Variable",
      "Loop", "Object", "Class", "Inheritance", "Interface", "Array", 
      "String", "Integer", "Boolean", "Compiler", "Debugger", "Syntax",
      "Exception", "Library", "Framework", "API", "Database", "Query", 
      "Server", "Client", "Protocol", "Encryption", "Binary", "Source",
      "IDE", "Repository", "Recursion", "Data", "Pointer", "Stack", "Queue", 
      "Tree", "Graph", "Hash", "Encryption", "Bit", "Byte", "Bandwidth", 
      "Cache", "Cloud", "Compiler", "Constant", "Debug", "Deployment",
      "DNS", "Domain", "Email", "Firewall", "Gateway", "Git", "Hardware",
      "HTTP", "HTTPS", "IP Address", "JSON", "Kernel", "LAN", "Metadata",
      "Multithreading", "Network", "Node", "Packet", "Patch", "Pixel",
      "Platform", "Plugin"))
    
    set.seed(1)
    my_data$freq <- round(rnorm(nrow(my_data), 10, 3)^2)
    

    If I plot without ordering at all, I get a fairly random distribution of sizes throughout the cloud:

    wordcloud2::wordcloud2(my_data, size = 0.3)
    


    If I sort from small to large, I can replicate your issue with the smaller words appearing in the middle of the image:

    wordcloud2::wordcloud2(my_data[order(my_data$freq),], size = 0.3)
    


    If I sort in reverse order, I get the desired output, with the larger words drawn in the middle before smaller words are added around and between the larger words:

    wordcloud2::wordcloud2(my_data[rev(order(my_data$freq)),], size = 0.3)
    

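To avoid repeating the sort at every call site, the ordering step can be wrapped in a small helper. The following is only a minimal sketch: `sort_for_cloud` and `example_data` are hypothetical names introduced here for illustration and are not part of wordcloud2. It assumes the data frame has the word in its first column and a numeric frequency column named `freq`.

```r
# Hypothetical helper (not part of wordcloud2): return the data frame
# sorted by frequency, largest first, so that the biggest words come
# first in the data passed to the plotting function.
sort_for_cloud <- function(df, freq_col = "freq") {
  stopifnot(is.data.frame(df), freq_col %in% names(df))
  df[order(df[[freq_col]], decreasing = TRUE), , drop = FALSE]
}

# Small made-up example data, deliberately not sorted.
example_data <- data.frame(
  word = c("Loop", "Array", "Stack", "Queue"),
  freq = c(12, 85, 40, 3)
)

sorted_data <- sort_for_cloud(example_data)
print(sorted_data)

# Build the cloud only if wordcloud2 is installed; printing `cloud`
# in an interactive session displays the widget.
if (requireNamespace("wordcloud2", quietly = TRUE)) {
  cloud <- wordcloud2::wordcloud2(sorted_data, size = 0.3)
}
```

Using `order(..., decreasing = TRUE)` gives the same largest-first ordering as `rev(order(...))` in the answer, except that tied frequencies keep their original relative order instead of being reversed.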