r, ggplot2, violin-plot

Layering violin plots with geom_violin to compare distributions


I am trying to compare the distributions of a continuous variable across groups using violin plots, which is easy enough on its own. To make the comparison easier, I would like to show the distribution of one group (the reference) in grey, with a low alpha value, in the background of every group's violin.

My current approach plots the data twice. For the first geom_violin, I copy the reference group's data into every group and plot it in grey. For the second geom_violin, I use the actual data d. In this example, the grey and blue violins for the group "blue" should look identical, because they are based on exactly the same data. However, they are NOT the same.

How can I resolve this problem, or is there a better approach?

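library(dplyr)    # filter(), mutate(), bind_rows()
library(tibble)   # tibble()
library(ggplot2)

# Simulated data: two groups with different spreads
d <- tibble(
    group = sample(c("green", "blue"), 1000, replace = TRUE, prob = c(0.7, 0.3)),
    x = ifelse(group == "green", rnorm(1000, 1, 1), rnorm(1000, 0, 3))
)

# Reference data: the "blue" observations copied into both groups
dblue <- filter(d, group == "blue")
dblue <- bind_rows(dblue, mutate(dblue, group = "green"))

# Grey reference layer first, then the actual data on top
ggplot(d, aes(x = factor(group), y = x)) +
    geom_violin(data = dblue, fill = alpha("#333333", 0.2), color = alpha("#333333", 0)) +
    geom_violin(fill = alpha("#0072B2", 0.8), color = alpha("#0072B2", 0))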



Solution

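  • Add scale = "width" to the second geom_violin. With the default scale = "area", violin widths are normalised across all violins in a layer, so the "blue" violin is drawn narrower when it shares a layer with the more sharply peaked "green" distribution. In the grey layer both violins hold the same data, so both are drawn at full width. With scale = "width", every violin has the same maximum width, and the two "blue" violins coincide.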

    ggplot(d, aes(x = factor(group), y = x)) +
      geom_violin(data = dblue, fill = alpha("#333333", 0.2), color = alpha("#333333", 0)) +
      geom_violin(fill = alpha("#0072B2", 0.8), color = alpha("#0072B2", 0),
                  scale = "width")
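The effect of scale can be checked numerically rather than by eye. The following is a minimal sketch, assuming that layer_data() returns the computed violinwidth and group columns for a geom_violin layer (as it does in recent ggplot2 releases, to my knowledge). The seed and the helper name max_violinwidth are illustrative and not part of the original question or answer.

```r
library(dplyr)
library(tibble)
library(ggplot2)

set.seed(42)  # assumption: arbitrary seed, only for reproducibility

d <- tibble(
  group = sample(c("green", "blue"), 1000, replace = TRUE, prob = c(0.7, 0.3)),
  x = ifelse(group == "green", rnorm(1000, 1, 1), rnorm(1000, 0, 3))
)
dblue <- filter(d, group == "blue")
dblue <- bind_rows(dblue, mutate(dblue, group = "green"))

# Hypothetical helper: widest point of each violin in one layer of a plot.
# Group 1 is "blue" and group 2 is "green" (alphabetical factor levels).
max_violinwidth <- function(plot, layer) {
  ld <- layer_data(plot, layer)
  vapply(split(ld$violinwidth, ld$group), max, numeric(1))
}

base <- ggplot(d, aes(x = factor(group), y = x)) + geom_violin(data = dblue)
p_area  <- base + geom_violin()                 # default scale = "area"
p_width <- base + geom_violin(scale = "width")

w_ref   <- max_violinwidth(p_area, 1)   # grey reference layer
w_area  <- max_violinwidth(p_area, 2)   # actual data, default scaling
w_width <- max_violinwidth(p_width, 2)  # actual data, scale = "width"

print(rbind(reference = w_ref, area = w_area, width = w_width))
```

If the assumption holds, the "blue" column should show a full-width violin in the reference layer and with scale = "width", and a noticeably smaller maximum width under the default scaling.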
    

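The duplication step from the question can also be written so that it works for any number of groups. This is only a sketch of one possible variant, not the answerer's method: the names reference, ref and dref are hypothetical, and scale = "width" is set on both layers so that neither depends on the default scaling.

```r
library(dplyr)
library(tibble)
library(ggplot2)

set.seed(42)  # assumption: arbitrary seed, only for reproducibility

d <- tibble(
  group = sample(c("green", "blue"), 1000, replace = TRUE, prob = c(0.7, 0.3)),
  x = ifelse(group == "green", rnorm(1000, 1, 1), rnorm(1000, 0, 3))
)

reference <- "blue"                   # hypothetical: name of the reference group
ref <- filter(d, group == reference)  # observations of the reference group

# Copy the reference observations once into every group present in d
dref <- bind_rows(lapply(unique(d$group), function(g) mutate(ref, group = g)))

p <- ggplot(d, aes(x = factor(group), y = x)) +
  geom_violin(data = dref, fill = alpha("#333333", 0.2),
              color = alpha("#333333", 0), scale = "width") +
  geom_violin(fill = alpha("#0072B2", 0.8),
              color = alpha("#0072B2", 0), scale = "width")
p
```

With only two groups this produces the same reference data as the dblue construction in the question. The lapply() form simply avoids writing one mutate() call per additional group.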