
Need specific coloring in ggplot2 with viridis


Here's the situation: I am generating complex stacked bar charts with 20+ entries. Downstream, however, this is often reduced to only 5 or 6 entries. I want to take the colors used for this downstream set and carry them back to the more complex upstream charts.

Essentially, I want anything that isn't in the final set to be colored gray, but I don't know how to go about doing this.

An additional wrinkle is the downstream data does not necessarily have the same shape as the upstream data. For context, this is a complex set of 16S biological sequencing data as well as pure DNA sequencing and classification.

My current thought is to assign a color directly to each specific value, but I'm not sure how to do this, or how to determine which color viridis uses for each value downstream.

Edit: These sets of data should be somewhat indicative of what I'm after:

First Set

SampleID Abundance
A 0.083
B 0.083
C 0.083
D 0.083
E 0.083
F 0.083
G 0.083
H 0.083
I 0.083
J 0.083
K 0.083
L 0.083

Downstream Set

SampleID Abundance
A 0.25
E 0.25
I 0.25
J 0.25

In this case I want A, E, I, and J to have consistent coloring and the other letters to be gray. I would also prefer to have all colored entries stacked together, with the gray left on top. The other option, I guess, is to go back and remove every entry that does not appear downstream, then add an asterisk saying, "missing regions are not found downstream."

Edit 2: A mockup of the expected output for the original and downstream data:

Example output


Solution

    library(tidyverse)
    library(viridis)
    #> Loading required package: viridisLite
    
    first <- tribble(~SampleID, ~Abundance,
                     "A", 0.083,
                     "B", 0.083,
                     "C", 0.083,
                     "D", 0.083,
                     "E", 0.083,
                     "F", 0.083,
                     "G", 0.083,
                     "H", 0.083,
                     "I", 0.083,
                     "J", 0.083,
                     "K", 0.083,
                     "L", 0.083) %>% 
      mutate(Class = "First")
    
    downstream <- tribble(~SampleID, ~Abundance,
                          "A", 0.25,
                          "E", 0.25,
                          "I", 0.25,
                          "J", 0.25) %>% 
      mutate(Class = "Downstream")
    
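    # One viridis color for each SampleID that survives downstream
    pal <- viridis(4)
    
    # Map every upstream label to a color: downstream IDs get a viridis
    # color, everything else is grey. `order` sorts colored labels first.
    maps <- tibble(labels = LETTERS[1:12],
           colors = case_when(labels == "A" ~ pal[1],
                              labels == "E" ~ pal[2],
                              labels == "I" ~ pal[3],
                              labels == "J" ~ pal[4],
                              TRUE ~ "Grey50")) %>% 
      mutate(order = ifelse(colors == "Grey50", 2, 1)) %>% 
      arrange(order, labels)
    
    # Named vector (label -> color) for scale_fill_manual()
    values <- set_names(maps$colors, maps$labels)
    
    # Combine both sets; factor levels follow maps$labels, so the colored
    # IDs are grouped ahead of the grey ones
    plot_data <- bind_rows(first, downstream) %>% 
      mutate(SampleID = factor(SampleID, maps$labels),
             Class = factor(Class, c("First","Downstream"))) %>% 
      arrange(Class, SampleID)
    
    ggplot(plot_data, aes(x = Class, y = Abundance, fill = SampleID, group = Class)) +
      geom_col() +
      scale_fill_manual("Legend", values = values, breaks = LETTERS[1:12])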
    

    Created on 2018-11-27 by the reprex package (v0.2.1)
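
The `case_when()` above hard-codes which labels are colored, which gets tedious with 20+ entries. As a rough sketch, the same named color vector could be derived from the two data sets themselves. The helper `make_fill_values()` and its `other` argument are hypothetical names introduced here for illustration; they are not part of the original answer or of any package. The sketch assumes the downstream IDs should be colored in alphabetical order.

```r
library(tidyverse)
library(viridis)

# Hypothetical helper (an assumption, not from the original answer):
# returns a named color vector suitable for scale_fill_manual().
# IDs in `kept_ids` get viridis colors; all remaining IDs get `other`.
make_fill_values <- function(all_ids, kept_ids, other = "Grey50") {
  kept   <- sort(unique(as.character(kept_ids)))
  others <- sort(setdiff(unique(as.character(all_ids)), kept))
  set_names(c(viridis(length(kept)), rep(other, length(others))),
            c(kept, others))
}

# Same example data as in the question
first <- tibble(SampleID = LETTERS[1:12], Abundance = 0.083, Class = "First")
downstream <- tibble(SampleID = c("A", "E", "I", "J"), Abundance = 0.25,
                     Class = "Downstream")

values <- make_fill_values(first$SampleID, downstream$SampleID)

# Factor levels follow names(values): downstream IDs first, then the rest
plot_data <- bind_rows(first, downstream) %>%
  mutate(SampleID = factor(SampleID, names(values)),
         Class = factor(Class, c("First", "Downstream"))) %>%
  arrange(Class, SampleID)

p <- ggplot(plot_data, aes(x = Class, y = Abundance, fill = SampleID, group = Class)) +
  geom_col() +
  scale_fill_manual("Legend", values = values, breaks = names(values))
p
```

Because the colors are looked up by name, a given SampleID keeps its color in every chart that reuses `values`, even when the upstream and downstream tables have different rows.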