
Why are levels that are not in the data mapped to alpha in ggplot2?


I am observing some strange behaviour. Mapping the cyl variable (from the mtcars data) to alpha produces a chart showing 5 different levels of alpha, even though only three values are present in the data.

Is this a bug, or am I missing something?

library(tidyverse)
count(mtcars, cyl)
#>   cyl  n
#> 1   4 11
#> 2   6  7
#> 3   8 14

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(alpha = cyl), size = 4)

Created on 2024-08-24 with reprex v2.1.1


Solution

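  • The reason is that cyl is a numeric column, so ggplot2 treats it as a continuous variable. By default, the breaks for a continuous scale are computed with scales::breaks_extended() (I'm not 100% sure this applies in all cases, though), which in turn returns approximately n = 5 breaks by default. Those breaks are chosen from the range of the data, not from the values actually present.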

    library(ggplot2)
    
    scales::breaks_extended()(mtcars$cyl)
    #> [1] 4 5 6 7 8
    

    A simple fix is to convert cyl to a factor, so that it is treated as discrete:

    ggplot(mtcars, aes(x = wt, y = mpg)) +
      geom_point(aes(alpha = factor(cyl)), size = 4)
    #> Warning: Using alpha for a discrete variable is not advised.
    

    Or set the breaks explicitly via the scale:

    ggplot(mtcars, aes(x = wt, y = mpg)) +
      geom_point(aes(alpha = cyl), size = 4) +
      scale_alpha_continuous(breaks = sort(unique(mtcars$cyl)))
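
Both fixes above depend on cyl being stored as a number with only three distinct values. A minimal base R sketch to confirm that, and to see what the breaks argument in the last snippet evaluates to, might look like this (the name cyl_values is only illustrative and not part of the original answer):

```r
# Base R only: mtcars ships with R, so no packages are needed here.
# cyl_values is a hypothetical helper name used for illustration.
cyl_values <- sort(unique(mtcars$cyl))

# cyl is numeric, which is why ggplot2 picks a continuous alpha scale
is.numeric(mtcars$cyl)
#> [1] TRUE

# The values actually present; these are the explicit breaks passed to the scale
cyl_values
#> [1] 4 6 8
```

Passing cyl_values to breaks limits the legend to those three entries while keeping alpha on a continuous scale.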