Tags: r, ggplot2, mean, standard-error

Display mean plus or minus standard error of the mean


I am trying to draw a dot plot of my data that shows every point together with the mean +/- the standard error of the mean (SEM). I built a new data frame, called "Descriptive", holding the mean and SEM values. I can plot the dots and the mean, but not the SEM as error bars. Could anyone offer some advice or suggest an approach for the script? Here is an example using the iris data.

library(ggplot2)
library(dplyr)
iris<- iris
SEM <- function(x){sd(x)/sqrt(length(x))}
iris %>% 
  group_by(Species) %>% 
  summarise(MEAN=mean(Sepal.Length), SEM=SEM(Sepal.Length)) ->   
  Descriptive
print(Descriptive)
ggplot(iris, aes(x=Species, y=Sepal.Length))+
  geom_point()+
  stat_summary(fun=mean, shape=95, size=8, color=2)

Solution

  • The stat_summary function can handle all of this. The fun argument sets only the middle value; the fun.data argument instead expects a function that returns a data frame with y, ymin, and ymax values, which lets stat_summary draw error bars:

    SEM <- function(x){data.frame(y=mean(x),
                                  ymin=mean(x)-sd(x)/sqrt(length(x)),
                                  ymax=mean(x)+sd(x)/sqrt(length(x)))}
    

    Now we can just use this with geom='errorbar' to get what you want:

    ggplot(iris, aes(x=Species, y=Sepal.Length))+
        geom_point() +
        stat_summary(fun.data=SEM, color = 'red', geom='errorbar', width=0.2)
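
    As a side note, ggplot2 ships a helper, mean_se(), that returns the same y/ymin/ymax data frame (mean plus or minus one SEM by default), so the hand-written function is optional. The sketch below repeats the SEM helper only so it runs on its own; the names p_builtin, setosa and same_values are illustrative and not part of the original answer.

    ```r
    library(ggplot2)

    # Hand-rolled helper from the answer, repeated so this block is self-contained.
    SEM <- function(x) {
      data.frame(y    = mean(x),
                 ymin = mean(x) - sd(x) / sqrt(length(x)),
                 ymax = mean(x) + sd(x) / sqrt(length(x)))
    }

    # Same plot, but using ggplot2's built-in mean_se() as the summary function.
    p_builtin <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
      geom_point() +
      stat_summary(fun.data = mean_se, color = "red", geom = "errorbar", width = 0.2)

    # Sanity check on one group: both helpers should give the same three numbers.
    setosa <- iris$Sepal.Length[iris$Species == "setosa"]
    same_values <- all.equal(unlist(SEM(setosa)), unlist(mean_se(setosa)))
    ```

    Printing p_builtin draws the plot. mean_se() also takes a mult argument if you want a wider interval, for example a rough 95% interval with mult = 1.96.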
    


    One other issue is that you've got a lot of overplotting. You might want to use geom_jitter instead:

    ggplot(iris, aes(x=Species, y=Sepal.Length))+
        geom_jitter(width=0.05) +
        stat_summary(fun.data=SEM, color = 'red', geom='errorbar', width=0.2)
    


    Depending on what you want, you can play around with the different geoms.

    'errorbar', used above, draws the bars but does not mark the mean on its own, so you need to add the mean separately, as you did before:

    ggplot(iris, aes(x=Species, y=Sepal.Length))+
        geom_jitter()+
        stat_summary(fun=mean, shape=95, size=6, color=2) +
        stat_summary(fun.data=SEM, color = 'red', geom='errorbar', width=0.2)
    
    


    'pointrange' draws a dot for the mean plus a line for the range. If you increase size to make the line thicker, you also need to reduce the fatten parameter, which controls how much bigger the dot is drawn relative to the line.

    ggplot(iris, aes(x=Species, y=Sepal.Length))+
        geom_jitter(width=0.05) +
        stat_summary(fun.data=SEM, color = 'red', geom='pointrange', size= 1, fatten=1)
    


    'crossbar' draws a box spanning the error range, with a horizontal line at the mean:

    ggplot(iris, aes(x=Species, y=Sepal.Length))+
        geom_jitter(width=0.05) +
        stat_summary(fun.data=SEM, color = 'red', geom='crossbar', width=0.2)
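
    If you would rather keep the precomputed Descriptive table from the question, a second route is to pass it to geom_errorbar() as its own data. This is a minimal sketch under a few assumptions: the scalar helper is renamed sem here (a name not in the original) so it does not clash with the data-frame-returning SEM defined above, and p_summary is just an illustrative variable name.

    ```r
    library(ggplot2)
    library(dplyr)

    # Scalar SEM helper, as in the question (renamed to avoid clashing with SEM above).
    sem <- function(x) sd(x) / sqrt(length(x))

    # One row per species holding the mean and its standard error.
    Descriptive <- iris %>%
      group_by(Species) %>%
      summarise(MEAN = mean(Sepal.Length), SEM = sem(Sepal.Length))

    # Raw points come from iris; the error bars and mean marker come from Descriptive.
    # inherit.aes = FALSE stops these layers from looking for Sepal.Length in Descriptive.
    p_summary <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
      geom_jitter(width = 0.05) +
      geom_errorbar(data = Descriptive,
                    aes(x = Species, ymin = MEAN - SEM, ymax = MEAN + SEM),
                    inherit.aes = FALSE, color = "red", width = 0.2) +
      geom_point(data = Descriptive, aes(x = Species, y = MEAN),
                 inherit.aes = FALSE, color = "red", shape = 95, size = 8)
    ```

    The stat_summary route is shorter, but this version is handy when the summary table is needed elsewhere anyway, for example for printing or exporting.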
    
