
Stacked bar plots with scaled colors in ggplot2


I am trying to generate a stacked bar plot using ggplot2.

Here is my toy data set.

toy.df <- data.frame(Sample=1:8, 
                cluster=c(rep("A1", 2), rep("A2", 2), rep("A3", 2), rep("A4", 2)),
                num = c(100, 300, 200, 250, 250, 240, 120, 100),
                Category = c(" hypo.","others", "hypo.", "others", "hypo.", "others", "hypo.", "others"),
                Color=c(rep("orange", 2), rep("green", 2), rep("blue", 2), rep("purple", 2)))

I need a bar plot where the x axis corresponds to "cluster", the y axis corresponds to the column "num", and the stacked segments correspond to Category (others, hypo.). I would like to use the colors that are set in the table, but I need a lighter (scaled) version of each color for the "hypo." group.

Do you have any idea how I can plot this using ggplot2?

Thanks


Solution

  • The quick and easy way to get your desired result is to use a manual fill scale based on the colors in your data, map cluster to fill, and map Category to alpha so that the "hypo." category is drawn with some transparency.

    Note: The first "hypo." contains a leading space, so the data actually has three different categories. For this reason I used trimws() to get rid of the leading space.

    dat <- data.frame(
      Sample = 1:8,
      cluster = c(rep("A1", 2), rep("A2", 2), rep("A3", 2), rep("A4", 2)),
      num = c(100, 300, 200, 250, 250, 240, 120, 100),
      Category = c(" hypo.", "others", "hypo.", "others", "hypo.", "others", "hypo.", "others"),
      Color = c(rep("orange", 2), rep("green", 2), rep("blue", 2), rep("purple", 2))
    )
    
    library(tidyverse)
    
    dat$Category <- trimws(dat$Category)
    
    # Named vector mapping each cluster to its color, e.g. c(A1 = "orange", ...)
    pal_fill <- dplyr::distinct(dat, cluster, Color) |>
      tibble::deframe()
    
    ggplot(dat, aes(cluster, num)) +
      geom_col(aes(fill = cluster, alpha = Category),
        position = position_stack(reverse = TRUE)
      ) +
      geom_text(aes(label = num, group = Category),
        position = position_stack(vjust = .5, reverse = TRUE), show.legend = FALSE
      ) +
      scale_fill_manual(values = pal_fill) +
      scale_alpha_manual(values = c(.4, 1)) +
      guides(fill = "none", alpha = guide_legend(reverse = TRUE))
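
To make the two preparation steps above easier to inspect, here is a minimal base R sketch (no tidyverse needed) that shows the effect of trimws() and builds the same kind of named palette vector. The names `categories_raw`, `pal_rows` and `pal_base` are only illustrative and do not appear in the answer's code.

```r
# Toy data from the question (note the leading space in the first Category)
dat <- data.frame(
  cluster = rep(c("A1", "A2", "A3", "A4"), each = 2),
  Category = c(" hypo.", "others", "hypo.", "others", "hypo.", "others", "hypo.", "others"),
  Color = rep(c("orange", "green", "blue", "purple"), each = 2),
  stringsAsFactors = FALSE
)

# Before trimming, " hypo." and "hypo." count as different categories
categories_raw <- unique(dat$Category)
length(categories_raw)

# After trimming, only "hypo." and "others" remain
dat$Category <- trimws(dat$Category)
unique(dat$Category)

# Base R counterpart of distinct() + deframe():
# one row per cluster, then a character vector of colors named by cluster
pal_rows <- unique(dat[, c("cluster", "Color")])
pal_base <- setNames(pal_rows$Color, pal_rows$cluster)
pal_base
```

Because scale_fill_manual() matches the names of `values` against the levels of the fill variable, a named vector like this keeps each cluster tied to its own color regardless of ordering.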
    

    A second option is to use "true" lighter colors instead of transparent ones, which requires slightly more effort. For this approach you can map the interaction of cluster and Category to fill and lighten the colors used for "hypo.", e.g. with colorspace::lighten. Additionally, getting a legend that shows Category requires a hack: I add a fake alpha legend, which I manipulate via the override.aes argument of guide_legend:

    
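    # Named vector with one color per cluster/Category combination,
    # e.g. "A1.hypo." (lightened) and "A1.others" (original color)
    pal_fill <- dplyr::distinct(dat, cluster, Category, Color) |>
      mutate(Color = if_else(grepl("^hypo", Category), colorspace::lighten(Color, .4), Color)) |>
      tidyr::unite(name, cluster, Category, sep = ".") |>
      tibble::deframe()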
    
    ggplot(dat, aes(cluster, num)) +
      geom_col(aes(fill = paste(cluster, Category, sep = "."), alpha = Category),
        position = position_stack(reverse = TRUE)
      ) +
      geom_text(aes(label = num, group = Category),
        position = position_stack(vjust = .5, reverse = TRUE), show.legend = FALSE
      ) +
      scale_fill_manual(values = pal_fill) +
      # alpha is mapped only to create the Category legend, so keep both levels opaque
      scale_alpha_manual(values = c(1, 1)) +
      guides(
        fill = "none",
        alpha = guide_legend(
          reverse = TRUE,
          override.aes = list(fill = colorspace::lighten("grey40", c(0, .4)))
        )
      )
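
If installing colorspace is not an option, the lightened palette can be approximated in base R. The sketch below is an assumption-laden alternative, not what colorspace::lighten does internally: the hypothetical helper `lighten_rgb()` simply mixes each RGB channel with white, so the resulting shades will differ somewhat from the ones produced above. `pal_mixed` is likewise an illustrative name.

```r
# Toy data from the question, with the leading space already trimmed
dat <- data.frame(
  cluster = rep(c("A1", "A2", "A3", "A4"), each = 2),
  Category = trimws(c(" hypo.", "others", "hypo.", "others", "hypo.", "others", "hypo.", "others")),
  Color = rep(c("orange", "green", "blue", "purple"), each = 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: move each RGB channel a fraction `amount` toward white.
# This is a plain RGB mix, not the algorithm used by colorspace::lighten.
lighten_rgb <- function(col, amount = 0.4) {
  rgb_vals <- grDevices::col2rgb(col)            # 3 x n matrix, values 0-255
  mixed <- rgb_vals + (255 - rgb_vals) * amount  # blend with white
  grDevices::rgb(mixed[1, ], mixed[2, ], mixed[3, ], maxColorValue = 255)
}

# One row per cluster/Category combination
pal_rows <- unique(dat[, c("cluster", "Category", "Color")])

# Lighten only the "hypo." rows and keep the original colors for "others"
is_hypo <- grepl("^hypo", pal_rows$Category)
pal_rows$Color[is_hypo] <- lighten_rgb(pal_rows$Color[is_hypo], 0.4)

# Name each color "cluster.Category" so it matches
# paste(cluster, Category, sep = ".") in the fill mapping
pal_mixed <- setNames(
  pal_rows$Color,
  paste(pal_rows$cluster, pal_rows$Category, sep = ".")
)
pal_mixed
```

Under these assumptions, `pal_mixed` could be passed to scale_fill_manual(values = pal_mixed) in place of `pal_fill` in the second plot.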