I fitted a linear mixed model (with lmer function). "Proportion" as my response variable and "Stage" as a fixed factor (with 5 levels) and Subject as a random factor.
LMM.fit<-lmer(Proportion~Stage+(1|Subject),data=tab)
Chisq Df Pr(>Chisq)
Stage 290.07 4 < 2.2e-16 ***
Post hoc pairwise comparisons were carried with the emmeans function to determine where the differences occurred across the Stages:
emmeans(LMM.fit, pairwise ~ Stage)$contrasts
contrast estimate SE df t.ratio p.value
Stage1 - Stage2 0.178 0.0505 105087 3.528 0.0038
Stage1 - Stage3 0.601 0.0363 565787 16.529 <.0001
Stage1 - Stage4 0.438 0.0496 224258 8.833 <.0001
Stage1 - Stage5 0.272 0.0497 295762 5.472 <.0001
Stage2 - Stage3 0.422 0.0514 103631 8.216 <.0001
Stage2 - Stage4 0.260 0.0615 95267 4.224 0.0002
Stage2 - Stage5 0.094 0.0620 107696 1.517 0.5516
Stage3 - Stage4 -0.163 0.0496 229001 -3.279 0.0092
Stage3 - Stage5 -0.328 0.0491 314661 -6.683 <.0001
Stage4 - Stage5 -0.166 0.0599 190576 -2.769 0.0446
Degrees-of-freedom method: kenward-roger
P value adjustment: tukey method for comparing a family of 5 estimates
I basically want to add the p-values shown in the emmeans results ON the boxplot shown above (between all the groups two by two in the same figure). I know there is the function stat_pvalue_manual()
but I stuggled to know how to use it with emmeans contrasts output
This is what I managed to do (not the most optimized code but it works...)
source("http://news.mrdwab.com/install_github.R")
install_github("mrdwab/overflow-mrdwab")
install_github("mrdwab/SOfun")
library(SOfun)
LMM.fit<-lmer(Proportion~Stage+(1|Subject),data=tab)
p.val.test<-pwpm(emmeans(LMM.fit, ~ Stage),means = FALSE, flip = TRUE,reverse = TRUE)
p.val.test<-sub("[<>]", "", p.val.test)
p.matx<-matrix(as.numeric((p.val.test)),nrow = length(p.val.test[,1]),ncol = length(p.val.test[,1])) #if your factor has 5 levels ncol and nrow=5
rownames(p.matx) <- colnames(p.matx) <-colnames(p.val.test)
p.matx[upper.tri(p.matx, diag=FALSE)] <- NA
stat.test<-subset(melt(p.matx),!is.na(value))
names(stat.test)<-c("group1","group2","p.adj")
stat.test[stat.test$p.adj<=0.001,"p.adj.signif"]<-"***"
stat.test[stat.test$p.adj>0.001 & stat.test$p.adj<=0.01,"p.adj.signif"]<-"**"
stat.test[stat.test$p.adj>0.01 & stat.test$p.adj<=0.05,"p.adj.signif"]<-"*"
stat.test[ stat.test$p.adj>0.05,"p.adj.signif"]<-"ns"
stat.test<-mc_tribble(stat.test) # copy & paste the result of this line before the line of ggboxplot!!
stat.test <- tribble(
~group1, ~group2, ~p.adj, ~p.adj.signif,
"Stage2","Stage1",0.0038,"**",
"Stage3","Stage1",1e-04,"***",
"Stage4","Stage1",1e-04,"***",
"Stage5","Stage1",1e-04,"***",
"Stage3","Stage2",1e-04,"***",
"Stage4","Stage2",2e-04,"***",
"Stage5","Stage2",0.5516,"ns",
"Stage4","Stage3",0.0092,"**",
"Stage5","Stage3",1e-04,"***",
"Stage5","Stage4",0.0446,"*")
bxp<-ggboxplot(tab, x = "Stage", y = "Proportion",color = "Stage", palette = "jco",add = "jitter",bxp.errorbar=T)
bxp + stat_pvalue_manual(stat.test,
y.position = max(tab$Proportion)+sd(tab$Proportion),
color = "midnightblue")
PS: this website helped ("GGPUBR: HOW TO ADD P-VALUES GENERATED ELSEWHERE TO A GGPLOT").