Tags: r, ggplot2, mixed-models, confidence-interval

How to plot 95% confidence intervals for multiple factor data in ggplot2 R?


df <- data.frame (yaxis = rnorm(120,5,1),
                  xaxis = rep(c("A","B","C","D","E", "F"), times = 20),
                  factor = rep(c("1","2","3","4","5", "6"), each = 20))

df$xaxis <- as.factor(df$xaxis)
df$factor <- as.factor(df$factor )

library(lmerTest)
library(merTools)  # provides predictInterval()
library(ggplot2)
mod <- lmer(yaxis ~  xaxis + (1|factor), df)
mod.fit <- predictInterval(mod, df) 
df.fit <- cbind(df, mod.fit)

ggplot(df.fit, aes(x = xaxis, y  = yaxis, color = xaxis, group = xaxis)) + 
 geom_point() + 
 theme_classic() + 
 geom_errorbar(aes(min = lwr, max = upr), color = "black", width = 0.3)  + 
 geom_point(aes(y = mean(fit), x = xaxis), position = position_dodge(width = 0.25), color = "black",size = 5) 

I want to plot the raw data together with the predicted means and their 95% confidence intervals. The code above draws multiple intervals for the fitted data instead of a single one per x-axis level. How can I fix it?



Solution

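  • predictInterval() returns one interval per row of df, which is why the original plot shows many error bars. Instead, you can extract the fixed effects from your model with summary(mod)$coefficients. Fitting the model without an intercept (the + 0 in the formula) makes each coefficient the estimated mean for one xaxis level, so you get exactly one prediction per level. The 95% confidence interval of each prediction is then the estimate plus or minus 1.96 standard errors, which is a normal approximation.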

    library(tidyverse)
    library(lme4)
    
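    # "+ 0" drops the intercept, so each fixed effect is the mean of one xaxis level
    mod <- lmer(yaxis ~  xaxis + (1|factor) + 0, df)
    
    # One row per level: estimate and 95% interval (estimate +/- 1.96 * SE).
    # as_tibble() drops the row names, so the level labels are added back by hand.
    summary_df <- summary(mod)$coefficients %>% 
      as_tibble() %>% 
      mutate(xaxis = LETTERS[1:6]) %>%
      rename(yaxis = Estimate) %>%
      mutate(ymin = yaxis - 1.96 * `Std. Error`,
             ymax = yaxis + 1.96 * `Std. Error`)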
    
    ggplot(df, aes(x = xaxis, y  = yaxis, color = xaxis, group = xaxis)) + 
      geom_jitter(width = 0.1) +
      geom_errorbar(aes(ymin = ymin, ymax = ymax), data = summary_df,
                    width = 0.4, linewidth = 1, alpha = 0.4) +
      geom_point(color = 'black', size = 2, data = summary_df) +
      theme_classic() +
      scale_color_brewer(palette = 'Dark2')
    

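To make the arithmetic behind the intervals explicit, here is a minimal base-R sketch of the same steps. It is an illustration under stated assumptions, not the answer's exact method: it swaps lmer() for lm(), so no random intercept is fitted and the standard errors will differ from those of the mixed model. The data are simulated stand-ins that reuse the question's column names. The sketch shows that a no-intercept fit returns one estimate per level (equal to the raw per-level mean in this simplified model) and that 1.96 is the 97.5% quantile of the standard normal.

```r
# Simulated stand-in data using the question's column names
set.seed(1)
df <- data.frame(yaxis = rnorm(120, 5, 1),
                 xaxis = factor(rep(LETTERS[1:6], times = 20)))

# Swapped-in model: lm() rather than lmer(), so there is no random effect.
# "+ 0" removes the intercept, so each coefficient is one level's mean.
fit <- lm(yaxis ~ xaxis + 0, data = df)
coefs <- coef(summary(fit))  # matrix with "Estimate" and "Std. Error" columns

z <- qnorm(0.975)  # about 1.96, the multiplier for a 95% normal interval

# One row per xaxis level: point estimate plus lower and upper bounds
summary_df <- data.frame(
  xaxis = levels(df$xaxis),
  yaxis = coefs[, "Estimate"],
  ymin  = coefs[, "Estimate"] - z * coefs[, "Std. Error"],
  ymax  = coefs[, "Estimate"] + z * coefs[, "Std. Error"]
)

# Without an intercept, the estimates match the raw per-level means
group_means <- tapply(df$yaxis, df$xaxis, mean)
print(all.equal(unname(summary_df$yaxis), as.vector(unname(group_means))))
print(summary_df)
```

The resulting summary_df has the same columns as the one in the answer, so it can be passed to geom_errorbar() and geom_point() in the same way. For the mixed model itself, keep the lmer() fit from the answer so the intervals account for the factor grouping.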