Setting factor level to be invisible in ggplot2


I think I missed a ggplot2 update, because some behavior has apparently changed. To omit a factor level, or make it "invisible", you used to be able to pass scale_color_manual a named vector with that level set to NA, as in the example below:

library(ggplot2)
library(tibble)

df_test <- tibble(y = c(100, -50, 20),
                  group = c("A", "B", "C"),
                  Marker = c("above average", " ", "below average"))

ggplot(data = df_test, mapping = aes(x = group, y = y, color = Marker)) +
  geom_bar(stat = "identity", position = position_dodge(width = 0.9), size = 1.5, fill = "yellow") +
  scale_color_manual(values = c("above average" = "green3",
                                "below average" = "red",
                                " " = NA))

Unfortunately, this doesn't work anymore; the factor level is now drawn in gray. My desired output is for the "B" group to have no outline and for that factor level not to appear in the legend. Using the "alpha" aesthetic does not work for me, because my actual code already uses alpha for something else.

Thanks in advance!


Solution

  • You're right in noticing that there were a few changes recently to how manual scales work with their values, and what happens to missing or surplus values. It seems like it's now easier to use a named vector to set the scale limits. (See issues 4511, 4534, 4569)

    Two more options:

    Using those scale changes, supply the scale values as a named vector that leaves out the level you don't want included. That level then gets the scale's NA value (na.value, which defaults to gray), and you can make it disappear by setting na.value = NA.

    ggplot(df_test, aes(x = group, y = y, color = Marker)) +
      geom_col(fill = "yellow", size = 1.5) +
      scale_color_manual(values = c("above average" = "green3", "below average" = "red"), 
                         na.value = NA)
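
    To see why the missing level ends up with the NA colour, here is a rough base-R sketch of a named-vector lookup. The helper lookup_colour is hypothetical and is not ggplot2's internal code; it only imitates the idea that a level with no entry in the named vector falls back to na.value ("grey50" is the documented default of scale_color_manual).

    ```r
    # Hypothetical helper (not part of ggplot2): look up each level in a named
    # vector and fall back to `na_value` for levels that have no entry.
    lookup_colour <- function(levels, values, na_value = "grey50") {
      matched <- unname(values[levels])  # levels missing from `values` give NA
      ifelse(is.na(matched), na_value, matched)
    }

    values <- c("above average" = "green3", "below average" = "red")
    marker_levels <- c("above average", " ", "below average")

    lookup_colour(marker_levels, values)
    # "green3" "grey50" "red"
    lookup_colour(marker_levels, values, na_value = NA)
    # "green3" NA       "red"
    ```

    With na_value = NA the " " level has no colour at all, which is what makes the outline of group "B" disappear in the plot above.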
    

    The second option is a bit silly, but is sometimes useful (I use it for making maps where different geographic levels need borders drawn based on their place in a hierarchy). Instead of giving all bars the same outline size, set the size scale manually, with the blank level getting a size of 0. Again, because the " " value is missing from the color vector, it won't be included in that legend. Then drop the size legend. If you need the legend keys to have thick borders, use the override.aes argument of the color legend's guide_legend().

    library(ggplot2)
    df_test <- tibble::tibble(y = c(100, -50, 20),
                      group = c("A", "B", "C"),
                      Marker = c("above average", " ", "below average"))
    
    ggplot(df_test, aes(x = group, y = y, color = Marker)) +
      geom_col(aes(size = Marker), fill = "yellow") +
      scale_color_manual(values = c("above average" = "green3", "below average" = "red")) +
      scale_size_manual(values = c("above average" = 1.5, "below average" = 1.5, " " = 0), 
                        guide = guide_none())
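
    If the legend keys should also get thick borders, one way to sketch it is with guide_legend(override.aes = ...) on the color scale. This is a minimal example, assuming a ggplot2 version in which the bar outline is still controlled by size, as in the code above; on ggplot2 3.4.0 or later that aesthetic was renamed to linewidth, so the size entries would probably need renaming. The object name p is only for illustration.

    ```r
    library(ggplot2)

    df_test <- data.frame(y = c(100, -50, 20),
                          group = c("A", "B", "C"),
                          Marker = c("above average", " ", "below average"))

    p <- ggplot(df_test, aes(x = group, y = y, color = Marker)) +
      geom_col(aes(size = Marker), fill = "yellow") +
      # The " " level is left out of the color values, so it is dropped from
      # this legend; override.aes thickens the outline of the remaining keys.
      scale_color_manual(values = c("above average" = "green3", "below average" = "red"),
                         guide = guide_legend(override.aes = list(size = 1.5))) +
      # Size 0 hides the outline of the blank level; the size legend is dropped.
      scale_size_manual(values = c("above average" = 1.5, "below average" = 1.5, " " = 0),
                        guide = guide_none())

    p
    ```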