I currently use stat_compare_means() to display the comparison test on a boxplot with ggplot2. However, I would like the function to display the number of individuals (n = ) in the samples between the name of the test and the p-value. Is there a way to add that information to stat_compare_means()? I could easily display it with geom_text() or annotate(), but that isn't convenient, since I want it written right before the p-value.
library(ggplot2)
library(ggpubr)  # provides stat_compare_means()

data(iris)
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_jitter(width = 0.1, shape = 21, colour = "black", fill = "grey95", stroke = 0.5, size = 2) +
  stat_compare_means(label.y.npc = 0.93, method = "kruskal.test", size = 5, family = "serif") +
  theme_classic()
We could show the total n by adjusting the label aesthetic of stat_compare_means:
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_jitter(width = 0.1, shape = 21, colour = "black", fill = "grey95",
              stroke = 0.5, size = 2) +
  stat_compare_means(label.y.npc = 0.93, method = "kruskal.test",
                     size = 5, family = "serif",
                     # method and p.format are computed by the stat;
                     # the separators keep the three parts from running together
                     aes(label = paste0(after_stat(method),
                                        ', n = ', nrow(iris),
                                        ', p = ', after_stat(p.format)))) +
  theme_classic()
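One caveat with nrow(iris): it counts every row, while a test such as kruskal.test typically drops rows with missing values. iris has no missing values, so the two agree here, but on other data they may not. A minimal base R sketch of counting only the complete observations follows; the helper name n_used is made up for illustration and is not part of ggpubr.

```r
# Hypothetical helper (not from ggpubr): count rows where both the
# response and the grouping variable are non-missing.
n_used <- function(data, response, group) {
  sum(complete.cases(data[, c(response, group)]))
}

data(iris)
n_total <- n_used(iris, "Sepal.Length", "Species")

# Simulate a data set with gaps to see how this differs from nrow()
iris_na <- iris
iris_na$Sepal.Length[1:5] <- NA
n_with_gaps <- n_used(iris_na, "Sepal.Length", "Species")
```

The result could then stand in for nrow(iris) inside the paste0() call.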
Personally, I think it makes more sense to label the x axis with the number of observations in each group:
library(tidyverse)
library(ggpubr)  # provides stat_compare_means()

iris %>%
  group_by(Species) %>%
  # append each group's size to its label; n() is evaluated per group
  mutate(Species = paste0(Species, '\n(n = ', n(), ')')) %>%
  ggplot(aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_jitter(width = 0.1, shape = 21, colour = "black", fill = "grey95",
              stroke = 0.5, size = 2) +
  stat_compare_means(label.y.npc = 0.93, method = "kruskal.test",
                     size = 5, family = "serif") +
  theme_classic(base_size = 16, base_family = 'serif')
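If you would rather leave the Species column untouched, the same per-group labels can probably be supplied through the axis scale instead. This is only a sketch of that variant: it assumes ggplot2 is installed, and n_labels is a name chosen here for illustration.

```r
library(ggplot2)

data(iris)

# Count observations per group; table() keeps the factor level order
counts <- table(iris$Species)

# Named character vector: names are the original levels,
# values are the labels to draw on the axis
n_labels <- setNames(
  paste0(names(counts), "\n(n = ", as.vector(counts), ")"),
  names(counts)
)

# The data keep their original Species values; only the axis text changes
p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  scale_x_discrete(labels = n_labels) +
  theme_classic()
```

Printing p draws the plot. Because the grouping column is not rewritten, the factor levels and their order stay as they were, and layers such as stat_compare_means() can be added to p as before.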