Tags: r, ggplot2, plot, graph, frequency

Editing ggplot2 Bubble Plot in R: Size, colors, and labels


I am trying to do a few things with this bubble plot.

  1. I'd like to add a label to each bubble showing its frequency of occurrence.
  2. I'd like to change the color scheme and theme of the graph, including the sizes of the bubbles. After adding after_stat I was no longer able to set the sizes, although after_stat seems to be what allows me to map frequencies to color and size.
  3. I'd like to change the title of the legend from "n" to something else.

p <- ggplot(Dataframe, aes(x = Confidence, y = Knowledge)) +
  geom_count(aes(color = after_stat(n), size = after_stat(n))) +
  guides(color = 'legend') +
  labs(x = "Confidence", y = "Knowledge") +
  theme(axis.title = element_text(size = 15, colour = "black"),
        axis.text = element_text(colour = "black", size = 12))



Previously, I tried the following. The bubbles could be made bigger, but I was only able to give them a single color (any color I wanted, just not more than one):

p <- ggplot(Dataframe, aes(x = Confidence, y = Knowledge)) +
  geom_count(color = "blue") +
  scale_size(range = c(1, 20), name = "Co-occurring") +
  labs(x = "Confidence", y = "Knowledge") +
  theme(axis.title = element_text(size = 15, colour = "black"),
        axis.text = element_text(colour = "black", size = 12))



Solution

  • You can add count labels with geom_text using stat = "sum". The size of the bubbles is controlled by the range argument of scale_size_continuous. For a labeled bubble chart, I would tend to use shape = 21 for the bubbles and control their color with the fill aesthetic. The colors can be whatever you choose, specified via scale_fill_gradientn. To change the legend title, pass the same name as the first argument to both the fill and size scales (with guide = "legend" on the fill scale, the two then merge into a single legend).

    As an aside, the order of the items on the x and y axes is wrong, because character columns are sorted alphabetically. You can fix this by converting Knowledge and Confidence to factors with the levels in the correct order:

    library(tidyverse)
    
    Dataframe %>%
      mutate(across(any_of(c("Confidence", "Knowledge")), 
                    ~ factor(.x, c("Not at all", "A little", "Somewhat",
                                   "Very", "Extremely")))) %>%
      ggplot(aes(x = Confidence, y = Knowledge)) + 
      geom_count(aes(fill = after_stat(n), size = after_stat(n)), shape = 21) +
      geom_text(stat = "sum", aes(label = after_stat(n)), size = 5) +
      scale_fill_gradientn("Observations", guide = "legend",
                           colors = c("pink", "white", "lightblue")) +
      scale_size_continuous("Observations", range = c(6, 15)) +
      theme_minimal(base_size = 5 * .pt)
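
To see why the factor conversion in the pipeline above matters, here is a minimal base R sketch. The vector `answers` is made up for illustration. Without explicit levels, the categories are ordered alphabetically, which is also how ggplot2 orders a character column on a discrete axis.

```r
# Hypothetical responses, for illustration only
answers <- c("Very", "A little", "Extremely", "Not at all", "Somewhat")

# Default: levels are sorted alphabetically
default_levels <- levels(factor(answers))

# Explicit: levels follow the order we supply
scale_order <- c("Not at all", "A little", "Somewhat", "Very", "Extremely")
ordered_levels <- levels(factor(answers, levels = scale_order))

print(default_levels)
print(ordered_levels)
```

The first vector starts with "A little" and ends with "Very"; the second follows the intended scale from "Not at all" to "Extremely".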
    

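
If you would rather keep the color aesthetic from the question instead of switching to shape = 21 and fill, something along these lines might work. This is only a sketch: the data frame `toy`, the vector `lvls`, and the chosen colors are made up for illustration, and the text labels are drawn in white purely so they stay readable on dark bubbles.

```r
library(ggplot2)

# Made-up survey responses, for illustration only
set.seed(1)
lvls <- c("Not at all", "A little", "Somewhat", "Very", "Extremely")
toy <- data.frame(
  Confidence = factor(sample(lvls, 60, TRUE), levels = lvls),
  Knowledge  = factor(sample(lvls, 60, TRUE), levels = lvls)
)

p <- ggplot(toy, aes(x = Confidence, y = Knowledge)) +
  # Map the count to both color and size, as in the question
  geom_count(aes(color = after_stat(n), size = after_stat(n))) +
  # Count labels, computed by the same "sum" stat
  geom_text(stat = "sum", aes(label = after_stat(n)),
            size = 3, color = "white", show.legend = FALSE) +
  # Same title on both scales so the legends can merge
  scale_color_gradient("Observations", low = "steelblue", high = "navy",
                       guide = "legend") +
  scale_size_continuous("Observations", range = c(6, 15)) +
  theme_minimal()

print(p)
```

Giving both scales the same name is what replaces the default "n" title; the range argument still controls the bubble sizes even though size is mapped with after_stat.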


    Data used

    Without reproducible data, I have had to construct a random data set with the same names and essential structure as your own:

    set.seed(123)
    
    Dataframe <- data.frame(Confidence = sample(c("A little",
                                                  "Somewhat",
                                                  "Very",
                                                  "Extremely"), 100, TRUE),
                            Knowledge  = sample(c("Not at all", 
                                                  "A little",
                                                  "Somewhat",
                                                  "Very",
                                                  "Extremely"), 100, TRUE))
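
As a quick check on the labels, the counts drawn by geom_count and stat = "sum" should match a plain cross-tabulation of the two columns. A base R sketch, which rebuilds the random data above so it runs on its own (the helper names `lvls` and `counts` are mine):

```r
set.seed(123)

# Same random data as constructed above
lvls <- c("Not at all", "A little", "Somewhat", "Very", "Extremely")
Dataframe <- data.frame(Confidence = sample(lvls[-1], 100, TRUE),
                        Knowledge  = sample(lvls, 100, TRUE))

# Cross-tabulate the two columns; n is the count per combination
counts <- as.data.frame(table(Dataframe), responseName = "n")

# Keep only combinations that actually occur, since empty ones are not drawn
counts <- counts[counts$n > 0, ]

print(counts)
```

Each row of `counts` corresponds to one bubble, and the n column is the number that should appear inside it.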