r ggplot2 legend mean

R: plotting a mean with stat_summary across multiple factors, and how the group aesthetic works


I am trying to understand how the 'group' aesthetic in stat_summary works and can't find good documentation on it. This is my problem:

Example dataframe

library(dplyr)    # provides bind_rows() and %>%
library(ggplot2)

df <- data.frame(x = c(1, 2, 4, 3, 1.5, 4, 3, 2, 6, 3, 4, 2, 5, 0, 1, 3, 5, 4),
                 factor_col = c(rep("A", 18)),
                 mouse_ID = c(1:18))

df2 <- data.frame(x = df$x + 3,
                 factor_col = c(rep("B", 18)),
                 mouse_ID = c(1:18))

Table = bind_rows(df, df2)

Table$mouse_ID = as.factor(Table$mouse_ID)
Table$factor_col = as.factor(Table$factor_col) 

I want to color the lines by mouse_ID to see individual variation between manipulation A and manipulation B (the two levels of the grouping factor factor_col), but I also want to plot the mean change across all mice, disregarding mouse_ID. This is the code I use:

(b = Table %>%
    ggplot(aes(x=factor_col, y=x, color = mouse_ID, group =mouse_ID)) +
    geom_point() +
    geom_line() +
    stat_summary(aes(y = x, group = factor_col), fun.y=mean, colour="black", geom="line", group=1, size=3) +
    xlab("Manipulations") +
    #ylim(0,1)+
    ylab("x-value") +
    labs(title = "")+
    theme_Publication() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)))

I think this code does what I want, but I don't understand why stat_summary needs group=1. What is this 1? Why do I have to specify 'group' twice in stat_summary? And how can I add 'Means' to the color legend?

Thank you!


Solution

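  • Since you want one mean line for your complete dataset, you only need group = 1 once, inside aes(). The 1 is just a constant: it puts every row in the same group, so this layer no longer inherits group = mouse_ID from ggplot() and the mean is computed across all mice at each x position: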

    You can use geom_line() for line charts to display values over time. geom_line() requires an additional group= aesthetic. If there should be only 1 line because there is only 1 time variable, then use group=1. If you want to split the lines based on another variable, use group=variable_name.
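
The effect of the group aesthetic can be checked by looking at what stat_summary actually computes. The sketch below rebuilds the question's data in base R and uses ggplot2's layer_data() to compare a layer that inherits group = mouse_ID with one that sets group = 1. The names base_plot, per_mouse and overall are only illustrative, and the fun argument assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Rebuild the example data from the question (base R only).
x_a <- c(1, 2, 4, 3, 1.5, 4, 3, 2, 6, 3, 4, 2, 5, 0, 1, 3, 5, 4)
Table <- data.frame(
  x          = c(x_a, x_a + 3),
  factor_col = factor(rep(c("A", "B"), each = 18)),
  mouse_ID   = factor(rep(1:18, times = 2))
)

base_plot <- ggplot(Table, aes(x = factor_col, y = x, group = mouse_ID))

# The layer inherits group = mouse_ID: the "mean" is taken per mouse and
# per x position, so every summary is just the single original value.
per_mouse <- layer_data(base_plot + stat_summary(fun = mean, geom = "line"))

# group = 1 puts all rows in one group: one mean per x position.
overall <- layer_data(
  base_plot + stat_summary(aes(group = 1), fun = mean, geom = "line")
)

nrow(per_mouse)  # one row per mouse and manipulation
nrow(overall)    # one row per manipulation
overall$y        # the means for A and B
```

With one group, the summary collapses to a single value for A and a single value for B, which geom = "line" then connects into one line.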

    You can use the following code:

    library(dplyr)
    library(ggplot2)
    
    Table %>%
      ggplot(aes(x=factor_col, y=x, color = mouse_ID, group =mouse_ID)) +
      geom_point() +
      geom_line() +
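      # fun replaces the deprecated fun.y (ggplot2 >= 3.3.0);
      # linewidth replaces size for lines (ggplot2 >= 3.4.0)
      stat_summary(aes(y = x, group = 1), fun = mean, colour = "black", geom = "line", linewidth = 3) +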
      xlab("Manipulations") +
      #ylim(0,1)+
      ylab("x-value") +
      labs(title = "")+
      theme(axis.text.x = element_text(angle = 45, hjust = 1))
    

    Created on 2023-06-30 with reprex v2.0.2
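
For the legend part of the question, one possible approach is to map colour to the constant string "Mean" inside aes() instead of setting colour = "black" as a fixed parameter, since only mapped aesthetics appear in a legend. The sketch below is an assumption-laden illustration rather than the only way to do it: line_colours and mouse_levels are made-up names, scales::hue_pal() is used to reproduce the default hues for the 18 mice, and linewidth assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)

# Rebuild the example data from the question (base R only).
x_a <- c(1, 2, 4, 3, 1.5, 4, 3, 2, 6, 3, 4, 2, 5, 0, 1, 3, 5, 4)
Table <- data.frame(
  x          = c(x_a, x_a + 3),
  factor_col = factor(rep(c("A", "B"), each = 18)),
  mouse_ID   = factor(rep(1:18, times = 2))
)

# Default ggplot2 hues for the mice, plus black for the extra "Mean" entry.
mouse_levels <- levels(Table$mouse_ID)
line_colours <- c(
  setNames(scales::hue_pal()(length(mouse_levels)), mouse_levels),
  Mean = "black"
)

p <- ggplot(Table, aes(x = factor_col, y = x, colour = mouse_ID, group = mouse_ID)) +
  geom_point() +
  geom_line() +
  # colour is mapped (inside aes), so "Mean" becomes a legend entry
  stat_summary(aes(group = 1, colour = "Mean"),
               fun = mean, geom = "line", linewidth = 3) +
  # breaks fixes the legend order: the 18 mice first, then "Mean"
  scale_colour_manual(name = "mouse_ID",
                      values = line_colours,
                      breaks = names(line_colours)) +
  labs(x = "Manipulations", y = "x-value")
```

Printing p draws the plot. The named values vector ties each legend entry to its colour explicitly, so the mean line stays black regardless of how the scale orders its levels.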