Tags: r, ggplot2, colors, scale

How to rescale color mapping in scale_color_distiller (ggplot2)?


My data consists of three numeric variables. Something like this:

library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(10000), y = rnorm(10000))
df$col <- df$x + df$y + df$x * df$y

Plotting this as a heatplot looks good:

ggplot(df, aes(x, y, col= col)) + geom_point(size= 2) + scale_color_distiller(palette = "Spectral")

Heatplot normal data

But real variables can be skewed or contain outliers, and this totally changes the plot. After setting df$col[nrow(df)] <- 100, the same ggplot code as above returns this plot:

Heatplot of data with outlier
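
To see the compression numerically, here is a minimal sketch in base R. It assumes that the color scale places each value linearly between the minimum and maximum of the data, which is the default behaviour as far as I know; the variable names position and central are only for illustration.

```r
set.seed(1)
x <- rnorm(10000)
y <- rnorm(10000)
col <- x + y + x * y
col[length(col)] <- 100  # the single outlier from the question

# Assumed default: each value is placed linearly between min(col) and max(col)
position <- (col - min(col)) / (max(col) - min(col))

# Share of the color scale that the central 98% of the points actually use
central <- quantile(position, c(0.01, 0.99))
central
diff(central)
```

If the difference printed at the end is small, almost all points share a narrow band of the palette, which matches the nearly uniform plot.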

Clearly, the problem is that this one point stretches the scale, and we get a plot with little information. My solution is to rank the data with rank(), which gives a reasonable color progression for every variable I've tried so far. See here:

ggplot(df, aes(x, y, col= rank(col))) + geom_point(size= 2) + scale_color_distiller(palette = "Spectral")

Ranked data with outlier

The problem with this solution is that the new scale (2,500 to 10,000) is shown in the color legend. I want the legend to show the original scale (0 to 10) instead. In other words, the color progression should follow the ranked data, but the legend should be labeled with the original values; I need to somehow map the original values onto the rank-based colors. Is that possible? I tried setting limits = c(0, 10) inside scale_color_distiller(), but this does not help.

Side notes: I do not want to remove the outlier. Ranking works well. I want to use scale_color_distiller(). If possible, I do not want to use any packages other than ggplot2.


Solution

  • Rescale the ranks to the range of your original df$col:

    library(tidyverse)  # loads ggplot2 and dplyr (for %>% and mutate)

    set.seed(1)
    df <- data.frame(x = rnorm(10000), y = rnorm(10000))
    df$col <- df$x + df$y + df$x * df$y
    df$col[nrow(df)] <- 100  # the outlier from the question

    df %>%
      mutate(
        # stretch the ranks linearly onto the range of the original values
        scaled_rank = scales::rescale(rank(col), range(col))
      ) %>%
      ggplot(aes(x, y, col = scaled_rank)) +
      geom_point(size = 2) +
      scale_color_distiller(palette = "Spectral")
    

    Created on 2023-03-18 with reprex v2.0.2
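
One thing to keep in mind with the snippet above: the legend shows ranks stretched linearly onto range(col), so a legend value is not the original value of a point with that color. A possible alternative, sketched here using only ggplot2, is to keep the rank as the color variable and relabel the legend through the breaks and labels arguments. The names col_rank, probs, idx, legend_breaks and legend_labels are made up for this sketch, and the five probabilities are an arbitrary choice.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(10000), y = rnorm(10000))
df$col <- df$x + df$y + df$x * df$y
df$col[nrow(df)] <- 100  # the outlier from the question

# Colors follow the rank, as in the question
df$col_rank <- rank(df$col)

# Pick a few observations spread through the sorted data
# (these probabilities are an arbitrary choice for illustration)
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
idx <- order(df$col)[round(probs * nrow(df))]

legend_breaks <- df$col_rank[idx]               # positions on the rank scale
legend_labels <- sprintf("%.1f", df$col[idx])   # original values shown there

p <- ggplot(df, aes(x, y, col = col_rank)) +
  geom_point(size = 2) +
  scale_color_distiller(
    palette = "Spectral",
    name = "col",
    breaks = legend_breaks,
    labels = legend_labels
  )
print(p)
```

With this approach the legend ticks are unevenly spaced in terms of the original values, which is expected: the colors are spread evenly over ranks, not over values.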