Tags: r, ggplot2, bubble-chart

Bubble chart basics in ggplot


I am trying to understand the best way to plot bubble charts, and it is more confusing than I expected.

Take a simple data frame:

library(tidyverse)

bubble_data <- data.frame(x = c(1, 1), y = c(1, 1), size = c(9, 16), type = c("A", "B"))

> bubble_data
  x y size type
1 1 1    9    A
2 1 1   16    B

If we map the size of the plotted bubbles to the size variable, then, if done correctly, the areas of the plotted bubbles should be in the ratio of 9 to 16, i.e. the smaller bubble should have ~56% of the area of the larger one.

Plotting this, we get something that does not look anywhere close to a 56% ratio of the areas (and the bubbles are really small):
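
As a quick check of that arithmetic (base R only; the variable names here are just for illustration), the target area ratio and the radius ratio it implies are:

```r
# Sizes of the two bubbles from bubble_data
sizes <- c(A = 9, B = 16)

# Target ratio of the plotted areas
area_ratio <- sizes[["A"]] / sizes[["B"]]

# Area grows with the square of the radius, so the radii
# should be in the ratio of the square roots
radius_ratio <- sqrt(area_ratio)

area_ratio    # 0.5625
radius_ratio  # 0.75
```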

ggplot(bubble_data, aes(x = x, y = y, size = size, color = type)) +
  geom_point(alpha = 0.3)

[image: bubble1]

So we could create a new variable by scaling the underlying data by taking the square root of the original data (since area is proportional to the square of the radius), and then plot that:

bubble_data_scaled <- bubble_data %>%
  mutate(scaled_size = sqrt(size))

> bubble_data_scaled
  x y size type scaled_size
1 1 1    9    A           3
2 1 1   16    B           4

# plot with scaled size
ggplot(bubble_data_scaled, aes(x = x, y = y, size = scaled_size, color = type)) +
  geom_point(alpha = 0.3) 

This yields a plot that looks pretty much the same as the previous plot:

[image: bubble2]

Finally, we could use scale_size_area() on the original bubble_data data set:

ggplot(bubble_data, aes(x = x, y = y, size = size, color = type)) +
  geom_point(alpha = 0.3) +
  scale_size_area()

But this does not seem to yield the correct result either, and one of the bubbles looks a bit offset.

[image: bubble3]

Which brings me to my question: What is the correct way to plot a bubble chart where the areas of the bubbles reflect the correct relative sizes of the underlying data, and how can I scale the plot output so that the bubbles are easier to see?

Thanks!


Solution

  • To get bubble areas proportional to size, use scale_size_area(), which maps a value of 0 to an area of 0. To make the bubbles bigger, increase the max_size parameter.

    ggplot(bubble_data, aes(x = x, y = y, size = size, color = type)) +
      geom_point(alpha = 0.3) +
      scale_size_area(max_size = 20, breaks = (0:5)^2)
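
One way to check the result, as a sketch rather than a definitive test: layer_data() returns the sizes ggplot2 actually assigned to each point. Assuming the point size behaves like a diameter (and ignoring the thin stroke outline), squaring the ratio of the two sizes gives the plotted area ratio. The helper area_ratio() is a hypothetical name introduced here. The default size scale maps the data range onto sizes 1 to 6 instead of starting from zero, which would explain why the first two plots in the question look alike.

```r
library(ggplot2)

bubble_data <- data.frame(x = c(1, 1), y = c(1, 1),
                          size = c(9, 16), type = c("A", "B"))

base_plot <- ggplot(bubble_data, aes(x = x, y = y, size = size, color = type)) +
  geom_point(alpha = 0.3)

# Same plot with the zero-based area scale
area_plot <- base_plot + scale_size_area(max_size = 20)

# Hypothetical helper: ratio of plotted areas (smaller / larger),
# treating the mapped point size as a diameter
area_ratio <- function(p) {
  s <- layer_data(p)$size
  (min(s) / max(s))^2
}

area_ratio(area_plot)  # should be about 9 / 16 = 0.5625
area_ratio(base_plot)  # much smaller with the default size scale
```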
    

    As for the offset appearance, that is probably due to the rendering device not using anti-aliasing, which can make lines appear jagged or shapes look asymmetric at the pixel level. The default plot window device on Windows, for instance, doesn't use anti-aliasing. You can install the ragg package and adjust the graphics settings in RStudio to make it the default for the plot window, or you can render with ggsave using a ragg or cairo device to get proper anti-aliasing.
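
A minimal sketch of the ggsave route, assuming the ragg package is installed (the code falls back to the standard png device if it is not). The output file name and the dimensions are placeholders chosen for illustration.

```r
library(ggplot2)

bubble_data <- data.frame(x = c(1, 1), y = c(1, 1),
                          size = c(9, 16), type = c("A", "B"))

p <- ggplot(bubble_data, aes(x = x, y = y, size = size, color = type)) +
  geom_point(alpha = 0.3) +
  scale_size_area(max_size = 20)

# Use ragg's anti-aliased PNG device when available (assumption: ragg is
# installed); otherwise fall back to the standard "png" device
png_device <- if (requireNamespace("ragg", quietly = TRUE)) ragg::agg_png else "png"

# Placeholder output path; a temporary directory keeps the sketch side-effect free
out_file <- file.path(tempdir(), "bubbles.png")

ggsave(out_file, plot = p, device = png_device,
       width = 6, height = 4, units = "in", dpi = 300)
```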

    To get coordinate-scale control over the bubble size (i.e. to make a bubble "x units wide" in coordinate space), you could use ggforce::geom_circle.
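
A rough sketch of the geom_circle approach, assuming the ggforce package is installed. Here the radius column r is an illustrative choice: it is computed so that each circle's area in data units equals size. coord_fixed() keeps one x unit the same length as one y unit so the circles are not stretched into ellipses.

```r
library(ggplot2)
library(ggforce)

bubble_data <- data.frame(x = c(1, 1), y = c(1, 1),
                          size = c(9, 16), type = c("A", "B"))

# Radius in data units such that pi * r^2 equals size
# (illustrative choice; any other mapping from size to r works the same way)
bubble_data$r <- sqrt(bubble_data$size / pi)

p_circle <- ggplot(bubble_data) +
  geom_circle(aes(x0 = x, y0 = y, r = r, fill = type),
              alpha = 0.3, color = NA) +
  coord_fixed()

p_circle
```

Because the radius is given in data units, the circles grow and shrink with the axes when the plot is resized, unlike geom_point sizes, which are fixed in absolute units.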