Tags: r, ggplot2, plot, scale, geom-point

How can I set the point size in a ggplot2 scatterplot to match the scale of the axes?


I am trying to make a scatterplot using ggplot2 in which the diameter of each point is expressed in the same units, and drawn at the same scale, as the variables on the axes. This problem is laid out well in this question as well as this one, which was resolved by drawing ellipses on the graph (nowadays done with geom_circle from ggforce). However, for my application I need to draw thousands of points, which is quick using geom_point but very slow using geom_circle. Is there a way to scale geom_point to the scale of the axes?

As an example of the problem, this graph shows the discrepancy in scales using scale_radius:

x <- runif(20, 0, 20)
y <- runif(20, 0, 20)
radius <- runif(20, 0, 4)
df <- data.frame(x = x, y = y, radius = radius)  # column named radius so aes(size = radius) reads it from df

library(ggplot2)

p <- ggplot(
    data = df,
    mapping = aes(
      x = x,
      y = y,
      size = radius
    )
  ) + 
  geom_point() + 
  coord_fixed() + xlim(0, 20) + ylim(0, 20) + 
  scale_radius(range = c(min(radius), max(radius)))

p


I have tried using scale_radius and scale_size_continuous, but both use a scale that is arbitrary in relation to the axis scales (scale_radius also does not scale such that a point of size 0 displays with size 0). I had the idea of accessing the plot size using ggplot_build and scaling the point sizes accordingly. I can access the plot range using ggplot_build(p)$layout$get_scales(i = 1) or layer_scales(p), but no variables appear to correspond to the size of the plot in the units that scale_radius uses.
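To make the mismatch concrete, here is a minimal sketch (the two-row data frame and the object names df_small and p_small are made up for illustration). It reads the axis limits back through layer_scales() and the mapped point sizes through layer_data(). As far as I understand, the size values are absolute device units (roughly millimetres) rather than data units, which is why they do not follow the axes when the plot is resized.

```r
library(ggplot2)

# Hypothetical two-point example, only to inspect what ggplot2 computes
df_small <- data.frame(x = c(5, 15), y = c(5, 15), radius = c(1, 3))

p_small <- ggplot(df_small, aes(x = x, y = y, size = radius)) +
  geom_point() +
  coord_fixed() + xlim(0, 20) + ylim(0, 20) +
  scale_radius(range = c(1, 3))

# Axis limits, in data units
x_limits <- layer_scales(p_small)$x$get_limits()
y_limits <- layer_scales(p_small)$y$get_limits()

# Sizes after the size scale has been applied; these are not data units
mapped_sizes <- layer_data(p_small)$size

print(x_limits)
print(y_limits)
print(mapped_sizes)
```

Nothing in these objects links one unit of size to one unit of x or y, so any conversion would also have to account for the physical size of the plotting panel on the output device.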


Solution

  • Using 2000 circles and a max radius of 1 (i.e. max diameter 2), I get a ~5-7x speedup by using a lower polygon resolution per circle. You might also look at your output device and try ragg, which is faster than cairo and still offers nice anti-aliasing.

    ggplot(df, aes(x0 = x, y0 = y, r = radius)) +
      ggforce::geom_circle(n = 20) + #5-7x faster than default
      coord_fixed()
    

    Still looks pretty good, with 1.5 sec render time on my system.


    (You might also consider defining your plot window range using coord_fixed(xlim = c(0,20), ylim = c(0,20)), since that zooms in on the viewing window instead of dropping all data points outside it, as your xlim() and ylim() do (they are shortcuts for scale_x_continuous(limits = ...) and scale_y_continuous(limits = ...)). It's not an issue for geom_point, but for geom_circle your approach will result in cut-off circles.)
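The suggestions above can be combined into one rough sketch for checking the speedup on your own machine. The helper names make_plot and time_render, the seed, and the output dimensions are made up for illustration; the comparison value n = 360 is, as far as I know, the geom_circle default. The sketch renders through ragg::agg_png when ragg is installed and falls back to the standard png device otherwise. Timings will vary by system, so none are quoted here.

```r
library(ggplot2)
library(ggforce)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
n_pts <- 2000
df <- data.frame(
  x = runif(n_pts, 0, 20),
  y = runif(n_pts, 0, 20),
  radius = runif(n_pts, 0, 1)
)

# Use ragg if it is available; otherwise fall back to the default png device
dev <- if (requireNamespace("ragg", quietly = TRUE)) ragg::agg_png else "png"

# Build the plot with a given number of polygon vertices per circle.
# coord_fixed(xlim, ylim) zooms the view without dropping any data.
make_plot <- function(n_vertices) {
  ggplot(df, aes(x0 = x, y0 = y, r = radius)) +
    geom_circle(n = n_vertices) +
    coord_fixed(xlim = c(0, 20), ylim = c(0, 20))
}

# Time a full render of the plot to a temporary PNG file
time_render <- function(n_vertices) {
  out <- tempfile(fileext = ".png")
  on.exit(unlink(out))
  system.time(
    ggsave(out, make_plot(n_vertices),
           width = 6, height = 6, dpi = 150, device = dev)
  )[["elapsed"]]
}

t_default <- time_render(360)  # assumed default resolution
t_coarse  <- time_render(20)   # coarser polygons, as suggested above
cat("n = 360:", t_default, "s\n")
cat("n = 20: ", t_coarse, "s\n")
```

If circles start to look visibly polygonal at large output sizes, raising n slightly trades some of the speedup back for smoother outlines.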