Tags: r, ggplot2, color-space

How do I control an unbalanced color scale with scale_fill_continuous_divergingx where one end should be logarithmic?


My question is similar to this one, which concerns the same function but a different problem.

I would like to apply a divergent color scale to p-values, where the value 0.05 is my mid point. Values above 0.05 should progress linearly, whereas values below should progress logarithmically. Is this possible with colorspace::scale_fill_continuous_divergingx or some other function from the colorspace package?

Alternatively, I could transform to a log-scale but I had a hard time marking the shift above 0.05 in a visually meaningful way.

Below you can find what I tried so far. Any ideas are very welcome.

df <- structure(list(name = c(3L, 12L, 15L, 14L, 5L, 18L, 11L, 4L, 
6L, 17L, 10L, 2L, 9L, 8L, 7L, 1L, 16L, 19L, 13L, 9L, 2L, 8L, 
15L, 16L, 17L, 4L, 19L, 10L, 7L, 1L, 6L, 5L, 11L, 12L), p_adjusted = c(4.32596750954342e-06, 
3.03135847907459e-05, 0.000118088275490085, 0.000131741744620688, 
0.000137720927111689, 0.00427368416054269, 0.00435924240679527, 
0.0105749752039341, 0.0108537078105272, 0.0156289799697254, 0.823419406127695, 
1, 1, 1, 1, 1, 1, 1, 3.57724493033791e-06, 9.05031572894023e-05, 
0.000118883184319132, 0.000143702004459057, 0.00033101896024948, 
0.00265474345049394, 0.00453440320908698, 0.00473248203895472, 
0.00508912585948996, 0.00881057444851548, 0.0200752446003521, 
0.024238863465647, 1, 1, 1, 1), group = c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L)), row.names = c(NA, 
-34L), class = c("tbl_df", "tbl", "data.frame"))

This uses more or less the default scale. The color scale provides a decent picture of what happens above 0.05.

ggplot2::ggplot(df, ggplot2::aes(x = group, y = name, fill = p_adjusted)) +
  ggplot2::geom_tile() +
  colorspace::scale_fill_continuous_divergingx(name = "p-value",
                                               mid = 0.05,
                                               palette = "RdYlBu") +
  ggplot2::theme_classic() +
  ggplot2::scale_x_discrete("Group") +
  ggplot2::scale_y_discrete("Feature")

A heatmap showing p-values from zero to one on a color scale from yellow to blue.

Then I tried to achieve a much faster progression of the scale below 0.05 in order to get more distinct colors there. This more or less fails.

ggplot2::ggplot(df, ggplot2::aes(x = group, y = name, fill = p_adjusted)) +
  ggplot2::geom_tile() +
  colorspace::scale_fill_continuous_divergingx(
    name = "p-value",
    mid = 0.05,
    palette = "RdYlBu",
    p1 = 1e-2,
    p2 = 1e-2,
    p3 = 1.5,
    p4 = 1.5
  ) +
  ggplot2::theme_classic() +
  ggplot2::scale_x_discrete("Group") +
  ggplot2::scale_y_discrete("Feature")

A heatmap showing p-values from zero to one on a color scale from yellow to blue. Values below 0.05 should turn to red but don't.

Lastly, I can use a log-transformed scale, which clearly shows differences in magnitude below 0.05 but cannot differentiate values above it. Also, I would like to clearly distinguish the mid point 0.05 on the colorbar.

ggplot2::ggplot(df, ggplot2::aes(x = group, y = name, fill = p_adjusted)) +
  ggplot2::geom_tile() +
  colorspace::scale_fill_continuous_divergingx(
    name = "p-value",
    mid = 0.05,
    palette = "RdYlBu",
    labels = function(x)
      format(x, digits = 3, big.mark = ","),
    trans = "log"
  ) +
  ggplot2::theme_classic() +
  ggplot2::scale_x_discrete("Group") +
  ggplot2::scale_y_discrete("Feature")

A heatmap showing p-values from zero to one on a color scale from red to yellow. The color scale is log-transformed. Values below 0.05 quickly progress to red but values above are all yellow.


Solution

  • You had the right idea with the log-transformed scale. The only problem is a bug whereby the midpoint doesn't get transformed along with the data. Pre-transforming the midpoint value, i.e. passing mid = log10(0.05), should therefore give a diverging scale centred at 0.05.

    library(ggplot2)
    
    df <- structure(list(
      name = c(3L, 12L, 15L, 14L, 5L, 18L, 11L, 4L, 6L, 17L, 10L, 2L, 9L, 8L, 7L, 
               1L, 16L, 19L, 13L, 9L, 2L, 8L, 15L, 16L, 17L, 4L, 19L, 10L, 7L, 1L, 
               6L, 5L, 11L, 12L), 
      p_adjusted = c(4.32596750954342e-06, 3.03135847907459e-05, 
                     0.000118088275490085, 0.000131741744620688,
                     0.000137720927111689, 0.00427368416054269, 
                     0.00435924240679527, 0.0105749752039341, 0.0108537078105272, 
                     0.0156289799697254, 0.823419406127695, 1, 1, 1, 1, 1, 1, 1, 
                     3.57724493033791e-06, 9.05031572894023e-05,
                     0.000118883184319132, 0.000143702004459057, 
                     0.00033101896024948, 0.00265474345049394, 0.00453440320908698, 
                     0.00473248203895472, 0.00508912585948996, 0.00881057444851548, 
                     0.0200752446003521, 0.024238863465647, 1, 1, 1, 1), 
      group = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L)
      ), row.names = c(NA, -34L), class = c("tbl_df", "tbl", "data.frame"))
    
    ggplot2::ggplot(df, ggplot2::aes(x = group, y = name, fill = p_adjusted)) +
      ggplot2::geom_tile() +
      colorspace::scale_fill_continuous_divergingx(
        trans = "log10",
        name = "p-value",
        mid = log10(0.05),
        palette = "RdYlBu"
      ) +
      ggplot2::theme_classic() +
      ggplot2::scale_x_discrete("Group") +
      ggplot2::scale_y_discrete("Feature")
    

    Created on 2021-03-30 by the reprex package (v1.0.0)
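
The effect of the untransformed midpoint can be checked with a few lines of base R. This is only a sketch, assuming (as the bug description above suggests) that the scale compares `mid` against the log10-transformed data. The helper `inside()` is made up for illustration and is not part of colorspace or ggplot2.

```r
# Smallest and largest adjusted p-values in df
p_range <- c(3.57724493033791e-06, 1)

# Range of the data after the log10 transformation (about -5.45 to 0)
log_range <- log10(p_range)

# Hypothetical helper: does a midpoint lie strictly inside a range?
inside <- function(mid, rng) mid > rng[1] && mid < rng[2]

mid_raw <- 0.05          # midpoint given on the original scale
mid_log <- log10(0.05)   # midpoint given on the log10 scale

inside(mid_raw, log_range)  # FALSE
inside(mid_log, log_range)  # TRUE
```

Under that assumption, `mid = 0.05` sits above every transformed value, so only one arm of the diverging palette is ever used. `log10(0.05)` is about -1.3 and falls inside the transformed range, so both arms of the palette are used.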