Tags: r, ggplot2, scale

Specify asymmetrical breaks in `scale_fill_stepsn()`


I want to add a stepped gradient for the fill aesthetic, with specific colors for specific intervals.

For instance, I'd like red between 0 and 15, blue between 15 and 18, etc.

Here is the closest I've (painfully) come to my goal:

library(tidyverse)
val_breaks = c(0, 15, 18, 20, 25)
col_breaks = c("red", "blue", "yellow", "purple")

# 4 intervals, 4 colors
cut(mtcars$qsec, breaks=val_breaks) %>% levels()
#> [1] "(0,15]"  "(15,18]" "(18,20]" "(20,25]"

mtcars %>%
  ggplot(aes(x = mpg, y = wt, color = qsec)) +
  geom_point(size = 3) +
  scale_color_stepsn(
    colors=col_breaks,
    limits=c(0,25),
    breaks=val_breaks,
    values=scales::rescale(val_breaks)
  )

[Image: ggplot output]

Created on 2024-12-08 with reprex v2.1.1

As you can see, the colors are not respected: each interval shows a blend of neighboring colors instead of the color I specified.

Is there a way to get exactly one color per interval?


Solution

  • You can achieve the desired result by passing the rescaled midpoints of the intervals to the values= argument, with the from= argument of scales::rescale() set to the limits= used for the scale:

    library(tidyverse)
    
    val_breaks <- c(0, 15, 18, 20, 25)
    col_breaks <- c("red", "blue", "yellow", "purple")
    
    midpoints <- val_breaks[-1] - diff(val_breaks) / 2
    
    mtcars %>%
      ggplot(aes(x = mpg, y = wt, color = qsec)) +
      geom_point(size = 3) +
      scale_color_stepsn(
        colors = col_breaks,
        limits = c(0, 25),
        breaks = val_breaks,
        values = scales::rescale(
          midpoints,
          from = c(0, 25)
        )
      )
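
    To see what ends up in values=, it can help to work through the arithmetic for the breaks used above. The snippet below is only a check of those numbers, reusing the same val_breaks and limits; it needs just the scales package.

    ```r
    library(scales)

    val_breaks <- c(0, 15, 18, 20, 25)

    # Midpoint of each bin: upper edge minus half the bin width
    midpoints <- val_breaks[-1] - diff(val_breaks) / 2
    midpoints
    #> [1]  7.5 16.5 19.0 22.5

    # Position of each midpoint on the 0-1 range defined by limits = c(0, 25)
    values <- rescale(midpoints, from = c(0, 25))
    values
    #> [1] 0.30 0.66 0.76 0.90
    ```

    Each of the four colors is therefore pinned to one of these four positions rather than to the five bin edges.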
    

    Explanation

    The reason is that a binned scale is still a continuous scale, and its color palette is still a color gradient. The color assigned to all points in a bin is the gradient's color at the midpoint of that bin.

    With scales::rescale(val_breaks) you specified that the colors apply to the edges of the bins. But because the color for each bin is taken at its midpoint, you end up with a mix of the colors at the bin's edges, as this shows:

    pal <- scales::pal_gradient_n(
      col_breaks,
      scales::rescale(val_breaks),
      "Lab"
    )
    
    pal(scales::rescale(midpoints, from = range(val_breaks))) |>
      scales::show_col()
    

    Hence, to fix the issue we have to make sure that the colors apply to the midpoints:

    pal2 <- scales::pal_gradient_n(
      col_breaks,
      scales::rescale(midpoints, from = range(val_breaks)),
      "Lab"
    )
    
    pal2(scales::rescale(midpoints, from = range(val_breaks))) |>
      scales::show_col()
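
    If you need this for several scales, the midpoint computation can be wrapped in a small function. This is a minimal sketch: midpoint_values() is a hypothetical helper name, not part of ggplot2 or scales, and it assumes the breaks are sorted and include both outer edges.

    ```r
    library(scales)

    # Hypothetical helper (not part of ggplot2 or scales): returns the rescaled
    # bin midpoints to pass as `values =` for a given set of breaks.
    # Assumes `breaks` is sorted and includes both outer edges.
    midpoint_values <- function(breaks, limits = range(breaks)) {
      stopifnot(is.numeric(breaks), length(breaks) >= 2, !is.unsorted(breaks))
      mids <- breaks[-1] - diff(breaks) / 2
      rescale(mids, from = limits)
    }

    midpoint_values(c(0, 15, 18, 20, 25))
    #> [1] 0.30 0.66 0.76 0.90
    ```

    In the scale call above this would read values = midpoint_values(val_breaks, limits = c(0, 25)), with one color per bin in colors=.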