Tags: r, ggplot2, colors, scale

R: How to manually set binned colour scale in ggplot?


I'm trying to set a colour scale for a ggplot dotplot, demarcating zones that are: unsuitable, suitable, ideal, suitable, unsuitable. Within the standard ggplot code wrapper, I have tried gradient2:

scale_colour_gradient2(low = "grey",
                       mid = "red",
                       high = "grey",
                       midpoint = 25) +

[image: gradient2]

and gradientn;

scale_colour_gradientn(colours = rev(rainbow(5)),
                       breaks = seq(0, 30, by = 5),
                       limits = c(0, 30),
                       labels = as.character(seq(0, 30, by = 5))) +

[image: gradientn]

stepsn,

scale_colour_stepsn(colours = c("grey", "blue", "red", "blue", "grey"),
                    values = scales::rescale(c(0, 22, 25, 26, 29, 40))) +

[image: stepsn]

and binned

scale_colour_binned(type = "viridis",
                      breaks = c(22, 25, 26, 29),
                      limits = c(0, 40),
                      guide = guide_coloursteps(even.steps = FALSE,
                                                show.limits = TRUE)) +

[image: binned]

binned is the closest but I can't work out how to specify the 5 colours I want (i.e., not viridis). The help option for 'type' that I presumably want is:

A function that returns a continuous colour scale

But I can't mentally square how/why I'd need to create a function that creates a continuous scale of colours, given I simply want to say (e.g.)

c("red", "yellow", "green", "yellow", "red")

I get the sense that either I'm close and am just misunderstanding a step somewhere (maybe in multiple places), or I'm using the wrong approach and it's none of these functions.

Any chance anyone can advise?

Thanks all!

Edit: update following Will's suggestions: Leaving the values in the geom_point call untouched, i.e.

geom_point(aes(colour = MeanETemp24hU2M), size = 0.5) +

(rather than rescaling them to 0:1), the following code:

scale_colour_stepsn(colours = c("red", "yellow", "green", "yellow", "red"),
                      limits = c(0, 40),
                      guide = guide_coloursteps(even.steps = FALSE,
                                                show.limits = TRUE),
                      breaks = c(0, 22, 25, 26, 29, 40)) +

Produces:

[image: stepsn close]

Which is close but

A. the colours vector doesn't match the breaks bins

B. the colours themselves are blends rather than being the true R colours. Using a modification of Will's example:

library(tidyverse)  # provides tibble(), %>% and ggplot2

x <- 1:100
y <- runif(100) * 100
tibble(x, y) %>% 
  ggplot() +
  aes(x = x, y = y, color = y) +
  geom_point() +
  scale_color_stepsn(
    colours = c("red", "yellow", "green", "yellow", "red"),
    breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1) * 100)

Produces:

[image: willexample]

The values are right and align to the colour bins properly (IDK why mine don't given the same formulation), but the colours are diluted blends - compare the red in that image to the red at the end of the rainbow in the second image.


Solution

  • Looks like you were close with scale_color_stepsn. If you pass your rescaled values to breaks instead, it might work. I think the following does what you were hoping for:

    library(tidyverse)
    
    x <- 1:100
    y <- runif(100)
    tibble(x, y) %>% 
      ggplot() +
      aes(x = x, y = y, color = y) +
      geom_point() +
      scale_color_stepsn(
        colours = c("red", "yellow", "green", "yellow", "red"),
        breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1)
      )
    

    You can swap in scale_color_gradientn instead, which is much prettier!

    scale_color_gradientn(
        colours = c("red", "yellow", "green", "yellow", "red"),
        breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1)
    )
    

    EDIT:

    The rescale issue is definitely causing a problem. I think this solution will work for scale_color_gradientn. Note that I changed the test data frame a little.

    library(tidyverse)
    
    x <- seq(0, 40, 0.25)
    y <- seq(0, 40, 0.25)
    tibble(x, y) %>% 
      ggplot() +
      aes(x = x, y = y, color = y) +
      geom_point() +
      scale_color_gradientn(
        colours = c("red", "yellow", "green", "yellow", "red"),
        values = scales::rescale(c(0, 22, 25, 26, 29, 40))
      )
    

    I tried several times to get something nice with scale_color_stepsn, but as you saw, this function seems to assign the "pure" colors at the breaks and blend the colors for values in between them. (I wasn't able to understand some of the other unusual behavior, such as colors being out of order.) I think if you want a stepped scale of pure red/yellow/green colors, my original comment is probably the best approach: assign a value in your data frame to map the colors to. The legend in the above code has evenly spaced labels. If you want to label the breaks, add breaks = c(0, 22, 25, 26, 29, 40) to the arguments, although I found it difficult to read the labels that way. HTH.
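
    For what it's worth, here is a minimal sketch of that "assign a value in your data frame" idea. It assumes it is acceptable to treat the zones as a discrete variable: cut() bins the values at the break points from the question, and scale_colour_manual() gives each bin one unblended colour. The data and the names df, temp, edges, zone and zone_cols are made up for illustration; temp stands in for MeanETemp24hU2M.

    ```r
    library(ggplot2)

    # Made-up data standing in for the real temperatures (range 0 to 40)
    set.seed(1)
    df <- data.frame(x = 1:200, temp = runif(200, min = 0, max = 40))

    # Bin edges taken from the question; cut() assigns each value to one of 5 zones
    edges <- c(0, 22, 25, 26, 29, 40)
    df$zone <- cut(df$temp, breaks = edges, include.lowest = TRUE)

    # One pure colour per zone, named by factor level so the mapping is explicit
    zone_cols <- setNames(
      c("red", "yellow", "green", "yellow", "red"),
      levels(df$zone)
    )

    p <- ggplot(df, aes(x = x, y = temp, colour = zone)) +
      geom_point() +
      # drop = FALSE keeps empty zones in the legend
      scale_colour_manual(values = zone_cols, drop = FALSE, name = "temp")
    p
    ```

    The trade-off is that the legend becomes a discrete key with interval labels such as (22,25] rather than a stepped colour bar, so this only suits cases where that is acceptable.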