I just want to add Kruskal-Wallis test results, which I compute outside the plot, at the top of some boxplots. I know I could calculate kruskal.test in the same geom_text line, but in this case I'd rather keep it outside since I already have it.
Please check the following MWE:
library(reshape2) # melt()
library(dplyr)    # %>%, group_by(), summarize()
library(ggplot2)
iris$treatment <- rep(c("A","B"), length(iris$Species)/2)
mydf <- melt(iris, measure.vars=names(iris)[1:4])
mydf$both <- factor(paste(mydf$treatment, mydf$variable, sep=' -- '), levels=(unique(paste(mydf$treatment, mydf$variable, sep=' -- '))))
##Signif levels comparing treatments per Species (regardless of variable)
addkw1 <- as.data.frame(mydf %>% group_by(Species) %>%
summarize(p.value = wilcox.test(value ~ treatment)$p.value)) #no need for kruskal, only 2 treatments
##Signif levels comparing variable per Species (regardless of treatment)
addkw2 <- as.data.frame(mydf %>% group_by(Species) %>%
summarize(p.value = kruskal.test(value ~ variable)$p.value))
##Signif levels comparing treatment+variable per Species
addkw3 <- as.data.frame(mydf %>% group_by(Species) %>%
summarize(p.value = kruskal.test(value ~ both)$p.value))
addkw1$TEST <- "Treatment"
addkw2$TEST <- "Variable"
addkw3$TEST <- "Treat:Var"
addkw <- rbind(addkw1, addkw2, addkw3)
addkw$p.adjust <- p.adjust(addkw$p.value, "BH")
sp <- "setosa"
addkw0 <- subset(addkw, Species==sp)
df0 <- subset(mydf, Species==sp)
pdf(file="test.pdf", height=15, width=15)
ggplot(df0, aes(x=both, y=value, fill=both)) + geom_boxplot() +
stat_summary(fun.y=mean, geom="point", shape=5, size=4) +
geom_text(data=addkw0,aes(x=0, y=0, label=p.adjust))
As you can see, I just want to plot the boxplots for setosa, for which I have 3 Kruskal-Wallis p-values. I would like 3 lines at the top of the boxplot, like:
KW p-value Treatment = 9.96e-01
KW p-value Variable = 1.92e-39
KW p-value Treat:Var = 1.18e-36
However, I get the following error, which I do not know how to solve. My guess is there might be some clash between the ggplot and the geom_text calls:
Error in FUN(X[[i]], ...) : object 'both' not found
Thanks to @baptiste's comment, I managed to avoid the error, but I still struggle to print the 3 lines with the 3 KW values. Any help?
This is the new code:
ggplot(df0, aes(x=both, y=value)) + geom_boxplot(aes(fill=both)) +
stat_summary(fun.y=mean, geom="point", shape=5, size=4) +
geom_text(data=addkw0, aes(x=0, y=0, label=paste0("KW pv = ", formatC(p.adjust, format="e", digits=2))), hjust=0)
which produces this (all three labels are drawn on top of each other at x = 0, y = 0):
To control the position of geom_text you need to specify the coordinates of each label. Since you have specified x = 0 and y = 0 for all three strings in the geom_text call, they overlap.
ggplot(df0, aes(x = both, y = value)) + geom_boxplot(aes(fill = both)) +
stat_summary(fun.y = mean, geom = "point", shape = 5, size = 4) +
geom_text(data = addkw0, aes(x = 7, y = c(4.5, 5, 5.5),
label = paste0("KW pv = ", formatC(p.adjust, format="e", digits=2))), hjust = 0)
For other applications, like labeling with text on top of each bar, one would need to calculate the appropriate coordinates beforehand and incorporate them in a data frame corresponding to addkw0.
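As a rough sketch of that last idea, the label text and its coordinates can be stored as columns of the label data frame and then mapped in geom_text. Everything below is an assumption made for illustration: addkw0 and df0 are small stand-ins built with base R rather than the objects from the question, the p-values are placeholders and not real test results, and the 0.4 vertical spacing is arbitrary. inherit.aes = FALSE is used so the text layer does not look for plot-level aesthetics in the label data frame.

```r
library(ggplot2)

# Hypothetical stand-in for addkw0; the p-values are placeholders, not results.
addkw0 <- data.frame(
  TEST     = c("Treatment", "Variable", "Treat:Var"),
  p.adjust = c(0.9, 1e-20, 1e-18)
)

# Stand-in for df0: setosa rows in long format, built with base R only.
setosa <- iris[iris$Species == "setosa", ]
df0 <- data.frame(
  both  = rep(names(iris)[1:4], each = nrow(setosa)),
  value = unlist(setosa[1:4], use.names = FALSE)
)

# Keep the label text and its coordinates in the label data frame itself.
addkw0$label <- paste0("KW p-value ", addkw0$TEST, " = ",
                       formatC(addkw0$p.adjust, format = "e", digits = 2))
addkw0$x <- 0.5                                 # just left of the first box
# One y per label, stacked above the tallest observation (first row on top).
addkw0$y <- max(df0$value) + rev(seq_len(nrow(addkw0))) * 0.4

p <- ggplot(df0, aes(x = both, y = value)) +
  geom_boxplot(aes(fill = both)) +
  geom_text(data = addkw0, aes(x = x, y = y, label = label),
            hjust = 0, inherit.aes = FALSE)
print(p)
```

Because each label carries its own x and y, the same pattern should extend to one label per box: add a column matching the x variable and compute y per group before plotting.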