r, ggplot2, cluster-analysis

Create a bubble plot (or something similar) from a cluster analysis in R


The following data frame is an example of a result from a clustering analysis, and I would like to visualize it. I'm thinking of a bubble plot that takes the column total into account but also prints the values of var1. I don't know how to do this, and unfortunately I cannot show the expected result (mainly because I'm not sure what I'm looking for ;-)). In short: for each cluster, I would like a bubble that scales with the variable total, and that bubble should also contain the values from var1.

set.seed(1)
df <- data.frame(cluster = rep(1:3,5),
                  var1 = sample(rep(LETTERS[1:10]),15,replace =TRUE),
                  total = sample(30:300,15, replace=TRUE))

Solution

  • plotly is a great tool for making graphs interactive. Text written inside the bubbles can be hard to read; plotly instead lets you hover over a point to display its text. Try something like:

    library(ggplot2)
    library(plotly)
    
    set.seed(1)
    df <- data.frame(cluster = rep(1:3,5),
                      var1 = sample(rep(LETTERS[1:10]),15,replace =TRUE),
                      total = sample(30:300,15, replace=TRUE))
    
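    # Bubble size scales with total. The `text` aesthetic is not drawn by
    # ggplot2 itself; ggplotly() uses it for the hover tooltip.
    plt <- ggplot2::ggplot(df, aes(x = cluster, y = var1, size = total,
                                   text = paste('</br>var1: ', var1, '</br>total: ', total))) +
        geom_point(alpha = 0.7)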
    
    ggplotly(plt)
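
If a static figure is needed instead, one possible sketch (not part of the answer above) draws a single bubble per cluster and prints the var1 values inside it with geom_text. Summing total per cluster and joining the unique var1 values into one label are assumptions about the desired output; `bubbles` and `label` are names chosen here for illustration only.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(cluster = rep(1:3, 5),
                 var1 = sample(LETTERS[1:10], 15, replace = TRUE),
                 total = sample(30:300, 15, replace = TRUE))

# One row per cluster: the summed total and the unique var1 values
# joined into a single label (an assumed aggregation, adjust as needed).
bubbles <- data.frame(
  cluster = sort(unique(df$cluster)),
  total   = as.vector(tapply(df$total, df$cluster, sum)),
  label   = as.vector(tapply(df$var1, df$cluster,
                             function(v) paste(sort(unique(v)), collapse = " ")))
)

# scale_size_area() makes the bubble area proportional to total;
# geom_text() keeps its own fixed font size and sits on top of the bubble.
p <- ggplot(bubbles, aes(x = factor(cluster), y = 0)) +
  geom_point(aes(size = total), colour = "steelblue", alpha = 0.5) +
  geom_text(aes(label = label), size = 3) +
  scale_size_area(max_size = 40) +
  labs(x = "cluster", y = NULL, size = "total") +
  theme_minimal() +
  theme(axis.text.y = element_blank(), panel.grid = element_blank())

print(p)
```

The `max_size` and text `size` values are guesses that may need tuning so the labels fit inside the smallest bubble.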
    
