
Separate subgroups in a boxplot


I have visualized some data using boxplots. Each variable belongs to a subgroup, and I want to separate or mark these subgroups in the plot: either with a small gap between each group of variables, or with some other indication of which subgroup a variable belongs to, without using fill = type. Below I generate some example data.

library(tidyverse)

set.seed(1234)
value <- sample(20:80, 1000, replace = TRUE)
group <- sample(c("A1", "A2", "B1", "B2", "C1", "C2"), 1000, replace = TRUE)

# data.frame() keeps value numeric; cbind() would coerce both columns to character
df <- data.frame(value, group)

# Groups whose name ends in "2" form "Group 2"; the rest form "Group 1"
df <- df %>%
  mutate(type = ifelse(grepl("2$", group), "Group 2", "Group 1"))

ggplot(df, aes(x = value, y = group)) +
  geom_boxplot() +
  xlim(0, 100)

This gives a single panel with one horizontal box for each of the six groups.

I tried to subgroup the plot by adding facet_wrap(~type, dir = "v"), which stacks one panel per type vertically, but each panel still lists all six groups on its y-axis.

However, I don't know how to remove the empty variables from each panel (e.g. A1, B1 and C1 in Group 2).

Does anyone know a good way (with or without facet_wrap()) to indicate subgroups of variables in a boxplot without using fill = type?


Solution

  • By default all facets share the same scales, so every panel shows all six factor levels. You can drop the unused levels from each panel with scales = "free":

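    library(dplyr)
    library(ggplot2)

    set.seed(1234)
    value <- sample(20:80, 1000, replace = TRUE)
    group <- sample(c("A1", "A2", "B1", "B2", "C1", "C2"), 1000, replace = TRUE)

    # data.frame() keeps value numeric; cbind() would coerce both columns to character
    df <- data.frame(value, group)

    # Groups whose name ends in "2" form "Group 2"; the rest form "Group 1"
    df <- df %>%
      mutate(type = ifelse(grepl("2$", group), "Group 2", "Group 1"))

    # scales = "free" lets each panel show only the groups that occur in it
    ggplot(df, aes(x = value, y = group)) +
      geom_boxplot() +
      xlim(0, 100) +
      facet_wrap(~type, dir = "v", scales = "free")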
    

    Created on 2020-09-11 by the reprex package (v0.3.0)
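
Two variants of the faceting approach may also be worth trying. This is only a sketch on the same example data, and the object names p_wrap and p_grid are illustrative. With scales = "free_y" only the y-axis is freed, so both panels keep one common x-axis. With facet_grid() and space = "free_y", each panel's height also follows the number of groups it contains, which should keep the boxes the same width if the subgroups differ in size.

```r
library(ggplot2)

# Same example data as above, built without the tidyverse for brevity
set.seed(1234)
df <- data.frame(
  value = sample(20:80, 1000, replace = TRUE),
  group = sample(c("A1", "A2", "B1", "B2", "C1", "C2"), 1000, replace = TRUE)
)
df$type <- ifelse(grepl("2$", df$group), "Group 2", "Group 1")

# Variant 1: free only the y-axis; the x-axis stays shared between panels
p_wrap <- ggplot(df, aes(x = value, y = group)) +
  geom_boxplot() +
  xlim(0, 100) +
  facet_wrap(~type, dir = "v", scales = "free_y")

# Variant 2: facet_grid() with one row per type; space = "free_y" makes
# each panel's height proportional to the number of groups shown in it
p_grid <- ggplot(df, aes(x = value, y = group)) +
  geom_boxplot() +
  xlim(0, 100) +
  facet_grid(type ~ ., scales = "free_y", space = "free_y")

# Draw the plots (print() is needed when running as a script)
print(p_wrap)
print(p_grid)
```

The facet_grid() variant puts the strip labels on the right-hand side, so the plot reads as one continuous axis with a small gap between the subgroups.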