Tags: r, ggplot2, boxplot, significance

Use stat_compare_means to test whether multiple groups are significantly different from zero?


I'm using ggpubr::stat_compare_means in ggplot2 to show significance for multiple boxplots. I am trying to find a way to show whether each of my boxplots is significantly different from a certain value (0), but I can only find ways to compare whether they are different from a particular group, or the means of all groups.

Here is what my plot looks like:

[Figure: example plot]

Some groups are above 0, some are below 0. What I want to test is whether each is significantly different from 0.

Currently stat_compare_means is calculating significance from this argument:

stat_compare_means(label = "p.signif", method = "t.test", ref.group = ".all.", label.y=1.1)

I know I need to change the "ref.group" argument. In this case I think it's taking the mean of all groups, and testing whether each group is significantly different from it.

The documentation for ref.group says:

"A character string specifying the reference group. If specified, for a given grouping variable, each of the group levels will be compared to the reference group (i.e. control group). ref.group can be also ".all.". In this case, each of the grouping variable levels is compared to all (i.e. basemean)."

Since this requires a character string, the only way I can think of to compare my groups to 0 is to make a dummy control group of 0's which will be the reference group. Then I can direct to that group in the ref.group argument.

Is there any other way to compare these groups to 0? Thanks.

  df %>%
    ggplot(aes(x = species, y = weighted_change)) +
    geom_hline(yintercept = 0, linetype = "dashed") +
    geom_boxplot(color = "orangered") +
    labs(x = "Species", y = "Mean Change", title = "Central Basin and Range") +
    theme(plot.title = element_text(hjust = 0.5, size = 12, face = "bold")) +
    theme(axis.title = element_text(size = 10, face = "bold")) +
    theme(axis.text = element_text(size = 10, face = "bold")) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    coord_cartesian(ylim = c(-1.1, 1.1)) +
    stat_compare_means(label = "p.signif", method = "t.test", ref.group = ".all.", label.y = 1.1)

Solution

  • I don't think stat_compare_means will help you much here: it seems to be designed mostly for comparisons between groups, whereas you want a one-sample test within each group (each group's mean against 0). Instead, you can run the test yourself before going to ggplot and pass the results in as a separate data frame:

    library(tidyverse)

    # One-sample t-test per group: does the group mean differ from 0?
    t_tests <- iris %>%
        group_by(Species) %>%
        summarise(P = t.test(Petal.Width, mu = 0)$p.value,
                  Sig = ifelse(P < 0.05, "*", "ns"),
                  # Highest value in each group, used to place the label above the box
                  MaxWidth = max(Petal.Width))

    ggplot(iris, aes(x = Species, y = Petal.Width)) +
        geom_boxplot() +
        # Use the prepared table of test results as data for the geom;
        # x = Species is inherited from the main plot, y is overridden
        geom_text(aes(label = Sig, y = MaxWidth + 0.2), size = 6,
                  data = t_tests)
    

    Since I don't have your data, I've used iris. It also has a species column, so it should be fairly clear how this maps onto your own data.
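If you want graded labels like the ones "p.signif" produces, rather than a single 0.05 cutoff, one possible variation is sketched below. The helper `p_to_symbol` is a hypothetical name of mine, and the cutpoints are what I understand ggpubr's documented defaults to be, so treat them as an assumption. The Holm adjustment via `p.adjust` is optional and was not part of the original approach; drop that line if you want raw p-values.

```r
library(dplyr)

# Hypothetical helper: map p-values to significance symbols.
# Cutpoints are assumed to mirror ggpubr's "p.signif" defaults.
# right = FALSE keeps the strict "<" comparison used in the answer (p < 0.05).
p_to_symbol <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 1e-04, 0.001, 0.01, 0.05, Inf),
                   labels = c("****", "***", "**", "*", "ns"),
                   right = FALSE, include.lowest = TRUE))
}

t_tests <- iris %>%
  group_by(Species) %>%
  summarise(P = t.test(Petal.Width, mu = 0)$p.value,
            MaxWidth = max(Petal.Width)) %>%
  # Optional: adjust for running one test per group
  mutate(P_adj = p.adjust(P, method = "holm"),
         Sig = p_to_symbol(P_adj))

print(t_tests)
```

The resulting `Sig` column can be passed to geom_text exactly as in the code above.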

    [Figure: boxplot with significance stars]
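For completeness, here is a rough sketch of the same idea using the column names from the question (`species`, `weighted_change`). The data frame is simulated, since the real `df` was not posted, and the group names "A", "B", "C" and the label height of 1.05 are placeholders I picked to fit the question's y-limits.

```r
library(dplyr)
library(ggplot2)

# Simulated stand-in for the asker's data (an assumption, not the real df)
set.seed(1)
df <- data.frame(
  species = rep(c("A", "B", "C"), each = 20),
  weighted_change = c(rnorm(20, 0.4, 0.2),   # group clearly above 0
                      rnorm(20, 0.0, 0.2),   # group centred on 0
                      rnorm(20, -0.4, 0.2))  # group clearly below 0
)

# One-sample t-test of each species against 0
tests <- df %>%
  group_by(species) %>%
  summarise(P = t.test(weighted_change, mu = 0)$p.value,
            Sig = ifelse(P < 0.05, "*", "ns"),
            LabelY = 1.05)  # fixed label height, just inside ylim

p <- ggplot(df, aes(x = species, y = weighted_change)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_boxplot(color = "orangered") +
  # Labels come from the table of test results, not from df
  geom_text(aes(label = Sig, y = LabelY), data = tests) +
  coord_cartesian(ylim = c(-1.1, 1.1)) +
  labs(x = "Species", y = "Mean Change")

print(p)
```

Placing the labels at a fixed height mimics the `label.y = 1.1` argument in the question; using the per-group maximum, as in the iris example, works just as well.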