
Color and faceting by multiple factors in ggplot


I have a data.frame that I'm trying to plot in facets using ggplot2's geom_boxplot in R:

set.seed(1)

vals <- rnorm(12)
min.vals <- vals-0.5
low.vals <- vals-0.25
max.vals <- vals+0.5
high.vals <- vals+0.25


df <- data.frame(sample=c("c0.A_1","c0.A_2","c1.A_1","c1.A_2","c2.A_1","c2.A_2","c0.B_1","c0.B_2","c1.B_1","c1.B_2","c2.B_1","c2.B_2"),
                 replicate=rep(c(1,2),6),val=vals,min.val=min.vals,low.val=low.vals,max.val=max.vals,high.val=high.vals,
                 group=c(rep("A",6),rep("B",6)),cycle=rep(c("c0","c0","c1","c1","c2","c2"),2),
                 stringsAsFactors = FALSE)

In this example there are two factors which I'd like to facet:

facet.factors <- c("group","cycle")
for(f in facet.factors) df[[f]] <- factor(df[[f]],levels=unique(df[[f]]))
levels.vec <- sapply(facet.factors,function(f) length(levels(df[,f])))

But in other cases I may have only one or more than two factors.

Is there a way to pass to facet_wrap the vector of factors by which to facet and the number of columns?

Here's what I tried, where in addition I created my own colors for each factor level:

library(RColorBrewer,quietly=TRUE)
library(scales,quietly=TRUE)
level.colors <- brewer.pal(sum(levels.vec),"Set2")

require(ggplot2)
ggplot(df,aes_string(x="replicate",ymin="min.val",lower="low.val",middle="val",upper="high.val",ymax="max.val",col=facet.factors,fill=facet.factors))+
  geom_boxplot(position=position_dodge(width=0),alpha=0.5,stat="identity")+
  facet_wrap(~facet.factors,ncol=max(levels.vec))+
  labs(x="Replicate",y="Val")+
  scale_x_continuous(breaks=unique(df$replicate))+
  scale_color_manual(values=level.colors,labels=unname(unlist(sapply(facet.factors,function(f) levels(df[,f])))),name="factor level")+scale_fill_manual(values=level.colors,labels=unname(unlist(sapply(facet.factors,function(f) levels(df[,f])))),name="factor level")+
  theme_bw()+theme(legend.position="none",panel.border=element_blank(),strip.background=element_blank(),axis.title=element_text(size=8))

which obviously throws this error:

Error in combine_vars(data, params$plot_env, vars, drop = params$drop) : 
  At least one layer must contain all variables used for facetting

Clearly this works:

ggplot(df,aes_string(x="replicate",ymin="min.val",lower="low.val",middle="val",upper="high.val",ymax="max.val",col=facet.factors,fill=facet.factors))+
  geom_boxplot(position=position_dodge(width=0),alpha=0.5,stat="identity")+
  facet_wrap(group~cycle,ncol=max(levels.vec))+
  labs(x="Replicate",y="Val")+
  scale_x_continuous(breaks=unique(df$replicate))+
  scale_color_manual(values=level.colors,labels=unname(unlist(sapply(facet.factors,function(f) levels(df[,f])))),name="factor level")+scale_fill_manual(values=level.colors,labels=unname(unlist(sapply(facet.factors,function(f) levels(df[,f])))),name="factor level")+
  theme_bw()+theme(legend.position="none",panel.border=element_blank(),strip.background=element_blank(),axis.title=element_text(size=8))


But it ignores the colors I'm passing and doesn't add a legend. I imagine that's because I cannot pass a vector to col and fill in the aesthetics, and I clearly have to hard-code the faceting.

This doesn't work either for the facetting problem:

ggplot(df,aes_string(x="replicate",ymin="min.val",lower="low.val",middle="val",upper="high.val",ymax="max.val",col=facet.factors,fill=facet.factors))+
      geom_boxplot(position=position_dodge(width=0),alpha=0.5,stat="identity")+
      facet_wrap(facet.factors[1]~facet.factors[2],ncol=max(levels.vec))+
      labs(x="Replicate",y="Val")+
      scale_x_continuous(breaks=unique(df$replicate))+
      scale_color_manual(values=level.colors,labels=unname(unlist(sapply(facet.factors,function(f) levels(df[,f])))),name="factor level")+scale_fill_manual(values=level.colors,labels=unname(unlist(sapply(facet.factors,function(f) levels(df[,f])))),name="factor level")+
      theme_bw()+theme(legend.position="none",panel.border=element_blank(),strip.background=element_blank(),axis.title=element_text(size=8))

So my questions are:
1. Is there a way to pass a vector to facet_wrap?
2. Is there a way to color and fill by a vector of factors rather than by a single one?


Solution

  • A single box cannot take two colors for its outline or fill, so I suggest pasting the faceting variables together and using the result as the color/fill variable:

    df$col.fill <- Reduce(paste, df[facet.factors])
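
To see what this line produces, here is a minimal sketch on a toy data frame (toy is a hypothetical stand-in for df, with the same group and cycle columns). It also shows base R's interaction() as a possible alternative that returns a factor with the same labels; that alternative is not part of the original answer.

```r
# Hypothetical toy data with the same faceting columns as in the question
toy <- data.frame(
  group = factor(c("A", "A", "B", "B")),
  cycle = factor(c("c0", "c1", "c0", "c1"))
)
facet.factors <- c("group", "cycle")

# Reduce(paste, ...) pastes the selected columns together element-wise,
# left to right, giving one combined label per row
toy$col.fill <- Reduce(paste, toy[facet.factors])
print(toy$col.fill)

# Alternative: interaction() builds the same labels, but as a factor
toy$col.fill2 <- interaction(toy[facet.factors], sep = " ", drop = TRUE)
print(as.character(toy$col.fill2))
```

Either form works with any number of columns in facet.factors, which is the point of avoiding a hard-coded paste(df$group, df$cycle).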
    

    The facets argument of facet_wrap accepts either a character vector or a one-sided formula:

    facet.formula <- as.formula(paste('~', paste(facet.factors,  collapse = '+')))
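
As a small sketch of the formula route, the snippet below builds the same one-sided formula in two ways: with paste, as above, and with stats::reformulate(), a base R helper that is not mentioned in the original answer. Both accept a character vector of any length, so the single-factor case needs no special handling.

```r
facet.factors <- c("group", "cycle")

# paste-based construction, as in the answer
f1 <- as.formula(paste("~", paste(facet.factors, collapse = "+")))

# reformulate() builds a one-sided formula directly from a character vector
f2 <- reformulate(facet.factors)

print(f1)
print(f2)

# The single-factor case works the same way
f3 <- reformulate("group")
print(f3)
```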
    

    So the code finally looks like this:

    ggplot(df,
           aes_string(
               x = "replicate", ymin = "min.val", ymax = "max.val",
               lower = "low.val", middle = "val", upper = "high.val",
               col = "col.fill", fill = "col.fill"
           )) +
        geom_boxplot(position = position_dodge(width = 0),
                     alpha = 0.5,
                     stat = "identity") +
        facet_wrap(facet.factors, ncol = max(levels.vec)) +
        # alternatively: facet_wrap(facet.formula, ncol = max(levels.vec)) +
        labs(x = "Replicate", y = "Val") +
        scale_x_continuous(breaks = unique(df$replicate)) +
        theme_bw() +
        theme(
            #legend.position = "none",
            panel.border = element_blank(),
            strip.background = element_blank(),
            axis.title = element_text(size = 8)
        )
    

    The legend was not displayed because you set legend.position = "none".

    BTW, it would definitely improve readability if you added some spaces and line breaks to your code.
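
The question also asked about supplying custom colors. A minimal sketch of one way to do that, assuming the combined col.fill variable from above: build one color per combination, name the color vector by factor level, and pass it to scale_colour_manual and scale_fill_manual. The data frame is rebuilt here so the block runs on its own. It uses aes() rather than aes_string(), since aes_string() is deprecated in recent ggplot2 versions and every mapped column name is fixed anyway. The choice of "Set2" follows the question and assumes between 3 and 8 combinations.

```r
library(ggplot2)
library(RColorBrewer)

# Same data as in the question (sample column omitted)
set.seed(1)
vals <- rnorm(12)
df <- data.frame(
  replicate = rep(c(1, 2), 6),
  val = vals,
  min.val = vals - 0.5, low.val = vals - 0.25,
  high.val = vals + 0.25, max.val = vals + 0.5,
  group = rep(c("A", "B"), each = 6),
  cycle = rep(rep(c("c0", "c1", "c2"), each = 2), 2)
)
facet.factors <- c("group", "cycle")

# One combined factor drives both colour and fill
df$col.fill <- factor(Reduce(paste, df[facet.factors]))

# One colour per combination, named by level so the mapping is explicit.
# Assumption: 3 to 8 combinations, the range the "Set2" palette supports.
level.colors <- setNames(
  brewer.pal(nlevels(df$col.fill), "Set2"),
  levels(df$col.fill)
)

# Number of facet columns: the largest number of levels among the factors
n.col <- max(sapply(df[facet.factors], function(x) length(unique(x))))

p <- ggplot(df, aes(x = replicate,
                    ymin = min.val, lower = low.val, middle = val,
                    upper = high.val, ymax = max.val,
                    colour = col.fill, fill = col.fill)) +
  geom_boxplot(position = position_dodge(width = 0),
               alpha = 0.5, stat = "identity") +
  facet_wrap(facet.factors, ncol = n.col) +
  scale_colour_manual(values = level.colors, name = "factor level") +
  scale_fill_manual(values = level.colors, name = "factor level") +
  labs(x = "Replicate", y = "Val") +
  scale_x_continuous(breaks = unique(df$replicate)) +
  theme_bw()

print(p)
```

Note that the palette needs one color per combination of levels (six here), not one per individual level as in the question's brewer.pal(sum(levels.vec), "Set2") call.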