Let's say I have a data frame:
df = data.frame(var = c("a", "a", "b", "b", "c", "c", "a", "a", "b", "b", "c", "c", "a", "a", "b", "b", "c", "c"),
                source = c("ref", "ref", "ref", "ref", "ref", "ref", "source1", "source1", "source1", "source1", "source1", "source1", "source2", "source2", "source2", "source2", "source2", "source2"),
                value = c(2.5, 1, 3.5, 1.6, 2.2, 3.1, 2, 1.2, 1.8, 0.4, 1.4, 1.3, 3, 2.8, 4, 3.6, 2.9, 3.8))
> df
   var  source value
1    a     ref   2.5
2    a     ref   1.0
3    b     ref   3.5
4    b     ref   1.6
5    c     ref   2.2
6    c     ref   3.1
7    a source1   2.0
8    a source1   1.2
9    b source1   1.8
10   b source1   0.4
11   c source1   1.4
12   c source1   1.3
13   a source2   3.0
14   a source2   2.8
15   b source2   4.0
16   b source2   3.6
17   c source2   2.9
18   c source2   3.8
and I would like to generate density plots of value for each var / source pair. That works with:
library(tidyverse)  # attaches ggplot2 and the %>% pipe

df %>%
  ggplot(aes(x = value)) +
  # after_stat(density) replaces the older ..density.. notation
  geom_density(aes(y = after_stat(density), fill = source), adjust = 1, alpha = 0.5) +
  facet_grid(source ~ var, scales = "fixed") +
  theme_bw()
But what I really want, based on this example, is to have only two rows, corresponding to source1 and source2, and to add another density curve to each of the plots based on the values from ref.
I tried to find a solution following this post but I did not succeed.
In other words, I would like each plot in the grid to show the distribution of the values in ref as a reference, and the ref group not to appear in the plot legend.
Any help is highly appreciated. Thank you.
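One option would be to split your data frame in two, one containing the reference values and one containing the others. For the data frame containing the reference values we also have to drop the source column: a layer whose data lacks a faceting variable is repeated in every panel along that variable, so the reference curve shows up in both rows instead of getting a row of its own. Then make use of two geom_density layers. Removing the reference from the legend is not a big deal. Simply leave the fill aes out of that layer and set your desired fill color, if any, as a parameter. In my code below I have simply set fill = NA.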
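library(ggplot2)

# Reference values only; drop `source` so this layer is repeated in every facet row
df1 <- df[df$source == "ref", names(df) != "source"]
# Everything else keeps `source` and is faceted and filled by it
df2 <- df[df$source != "ref", ]

ggplot(mapping = aes(x = value)) +
  # Reference curve: fill is set outside aes(), so it adds no legend entry
  geom_density(data = df1, aes(y = after_stat(density)), fill = NA, adjust = 1) +
  geom_density(data = df2, aes(y = after_stat(density), fill = source), adjust = 1, alpha = 0.5) +
  facet_grid(source ~ var, scales = "fixed") +
  theme_bw()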
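If you want to convince yourself that the reference curve really lands in every panel, here is a minimal self-contained sketch of a check. The names ref, others and p are only for illustration, the dashed linetype is an optional styling choice of mine rather than part of the approach, and I am relying on layer_data() from ggplot2 to return the computed data of a single layer. The y aesthetic is left out because geom_density plots the density by default.

```r
library(ggplot2)

# Same data as in the question, written more compactly with rep()
df <- data.frame(
  var    = rep(rep(c("a", "b", "c"), each = 2), times = 3),
  source = rep(c("ref", "source1", "source2"), each = 6),
  value  = c(2.5, 1, 3.5, 1.6, 2.2, 3.1, 2, 1.2, 1.8, 0.4, 1.4, 1.3,
             3, 2.8, 4, 3.6, 2.9, 3.8)
)

# Reference values without the `source` column, and everything else
ref    <- df[df$source == "ref", c("var", "value")]
others <- df[df$source != "ref", ]

p <- ggplot(mapping = aes(x = value)) +
  # Layer 1: reference curve, no fill mapping, hence no legend entry
  geom_density(data = ref, fill = NA, linetype = "dashed") +
  # Layer 2: the actual sources, filled and faceted by `source`
  geom_density(data = others, aes(fill = source), alpha = 0.5) +
  facet_grid(source ~ var) +
  theme_bw()

# Number of panels each layer is drawn in; with 2 sources x 3 vars
# both counts should match the number of panels in the grid
n_panels_ref    <- length(unique(layer_data(p, 1)$PANEL))
n_panels_others <- length(unique(layer_data(p, 2)$PANEL))
print(c(reference = n_panels_ref, others = n_panels_others))
```

Printing p draws the plot itself. If the two counts differ, the reference data most likely still carries a source column.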