Add means to histograms by group in ggplot2


I am following this source to draw histograms by group in ggplot2.

The sample data looks like this:

set.seed(3)
x1 <- rnorm(500)
x2 <- rnorm(500, mean = 3)
x <- c(x1, x2)
group <- c(rep("G1", 500), rep("G2", 500))

df <- data.frame(x, group = group)

And the code:

# install.packages("ggplot2")
library(ggplot2)

# Histogram by group in ggplot2
ggplot(df, aes(x = x, fill = group, colour = group)) + 
  geom_histogram(alpha = 0.5, position = "identity")

I know that adding a line like:

  +geom_vline(aes(xintercept=mean(group),color=group,fill=group), col = "red")

should allow me to get what I am looking for, but I get a histogram with only a single mean line, not one mean per group.

Do you have any suggestions?


Solution

  • In addition to the previous suggestion, you can also compute the group means once and store them separately, i.e. as two values rather than as a column of nrow = 1000 highly redundant values:

    library(dplyr)

    ## a 'tidy' approach (one of several valid ways to do a groupwise calculation):
    group_means <- df %>%
      group_by(group) %>%
      summarise(group_means = mean(x, na.rm = TRUE)) %>%
      pull(group_means)   # numeric vector with one mean per group
    
    ## ... ggplot code ... +
        geom_vline(xintercept = group_means)
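
If the mean lines should also be coloured by group, one possible variant is to keep the means in a small data frame and pass it to geom_vline() through its data argument. The following is only a sketch under a few assumptions: it rebuilds the question's sample data, uses base R's aggregate() instead of dplyr, and the names means_df and mean_x are made up for illustration. The bins = 30 and linetype = "dashed" settings are cosmetic choices, not part of the original code.

```r
library(ggplot2)

# Sample data as in the question
set.seed(3)
df <- data.frame(
  x = c(rnorm(500), rnorm(500, mean = 3)),
  group = rep(c("G1", "G2"), each = 500)
)

# One row per group; rename the aggregated column to mean_x (hypothetical name)
means_df <- aggregate(x ~ group, data = df, FUN = mean)
names(means_df)[names(means_df) == "x"] <- "mean_x"

# Histogram by group plus one vertical line per group mean
p <- ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity", bins = 30) +
  geom_vline(data = means_df,
             aes(xintercept = mean_x, colour = group),
             linetype = "dashed")
p
```

Because means_df has exactly one row per group, geom_vline() draws two lines, and mapping colour to group makes each line match its histogram.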