Tags: r, ggplot2, plot, p-value, ggpubr

Plotting p-values with ggpubr in grouped plot results in error


I am plotting a grouped dotplot with ggplot2, which works fine. But when I use stat_compare_means from the ggpubr package to add p-values, it doesn't work. When I use compare_means outside of the plotting environment, it works fine and I get the correct p-values for the dataset (using compare_means(X1 ~ X5, data=titer_lung, method="t.test")).

I am using the following data:

> titer_lung
         X1  X2               X3    X4     X5
1  4.531479  NK       mutIRFE_FB Lunge group1
2  4.068186  NK       mutIRFE_FB Lunge group1
3  4.071882  NK       mutIRFE_FB Lunge group1
4  4.117271  NK       mutIRFE_FB Lunge group1
5  4.117271  NK       mutIRFE_FB Lunge group1
6  4.462398 -NK       mutIRFE_FB Lunge group2
7  4.643453 -NK       mutIRFE_FB Lunge group2
8  4.556303 -NK       mutIRFE_FB Lunge group2
9  4.724276 -NK       mutIRFE_FB Lunge group2
10 4.491362 -NK       mutIRFE_FB Lunge group2
11 3.903090  NK    mutIRFErev_FB Lunge group3
12 4.342423  NK    mutIRFErev_FB Lunge group3
13 4.113943  NK    mutIRFErev_FB Lunge group3
14 4.653213  NK    mutIRFErev_FB Lunge group3
15 4.230449  NK    mutIRFErev_FB Lunge group3
16 4.556303 -NK    mutIRFErev_FB Lunge group4
17 4.462398 -NK    mutIRFErev_FB Lunge group4
18 4.230449 -NK    mutIRFErev_FB Lunge group4
19       NA -NK    mutIRFErev_FB Lunge group4
20 4.591065 -NK    mutIRFErev_FB Lunge group4
21 4.230449  NK    d3IDE mutIRFE Lunge group5
22 4.531479  NK    d3IDE mutIRFE Lunge group5
23 4.812913  NK    d3IDE mutIRFE Lunge group5
24 4.544068  NK    d3IDE mutIRFE Lunge group5
25 4.342423  NK    d3IDE mutIRFE Lunge group5
26 4.380211 -NK    d3IDE mutIRFE Lunge group6
27 4.698970 -NK    d3IDE mutIRFE Lunge group6
28 4.716003 -NK    d3IDE mutIRFE Lunge group6
29 4.477121 -NK    d3IDE mutIRFE Lunge group6
30 4.740363 -NK    d3IDE mutIRFE Lunge group6
31 4.255273  NK d3IDE mutIRFErev Lunge group7
32 4.322219  NK d3IDE mutIRFErev Lunge group7
33 4.113943  NK d3IDE mutIRFErev Lunge group7
34 4.176091  NK d3IDE mutIRFErev Lunge group7
35 4.518514  NK d3IDE mutIRFErev Lunge group7
36 4.724276 -NK d3IDE mutIRFErev Lunge group8
37 4.462398 -NK d3IDE mutIRFErev Lunge group8
38 4.785330 -NK d3IDE mutIRFErev Lunge group8
39 4.431364 -NK d3IDE mutIRFErev Lunge group8
40 4.826075 -NK d3IDE mutIRFErev Lunge group8
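Since the printed table cannot be pasted straight back into R, here is a minimal sketch that rebuilds it. The values are copied from the output above; storing the text columns as plain character vectors is an assumption, because the printout does not show whether they were factors in the original session.

```r
# Rebuild the printed table; X1 values are copied row by row from the output above.
titer_lung <- data.frame(
  X1 = c(4.531479, 4.068186, 4.071882, 4.117271, 4.117271,
         4.462398, 4.643453, 4.556303, 4.724276, 4.491362,
         3.903090, 4.342423, 4.113943, 4.653213, 4.230449,
         4.556303, 4.462398, 4.230449, NA,       4.591065,
         4.230449, 4.531479, 4.812913, 4.544068, 4.342423,
         4.380211, 4.698970, 4.716003, 4.477121, 4.740363,
         4.255273, 4.322219, 4.113943, 4.176091, 4.518514,
         4.724276, 4.462398, 4.785330, 4.431364, 4.826075),
  # NK / -NK alternate in blocks of five within each construct
  X2 = rep(rep(c("NK", "-NK"), each = 5), times = 4),
  # Four constructs, ten rows each
  X3 = rep(c("mutIRFE_FB", "mutIRFErev_FB",
             "d3IDE mutIRFE", "d3IDE mutIRFErev"), each = 10),
  X4 = "Lunge",
  # Eight groups, five rows each
  X5 = rep(paste0("group", 1:8), each = 5),
  stringsAsFactors = FALSE  # assumption: character columns, not factors
)

str(titer_lung)
```

Note that X5 (group1 to group8) is just the combination of X3 and X2, which matters for how the comparisons are specified later.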

And by using the following code I get the plot I am looking for:

titer_plot <- ggplot(titer_lung, aes(y=titer_lung$X1, x=titer_lung$X3, 
                                    fill=titer_lung$X2)) +
  geom_dotplot(aes(fill = titer_lung$X2, color = titer_lung$X2),
               trim = FALSE,
               binaxis='y', stackdir='center', dotsize = 0.8,
               position = position_dodge(0.8)
  )+
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median,
               geom = "crossbar", width = 0.5, position = position_dodge(0.8)) +
  scale_fill_manual(values = c("black", "white"))+
  scale_color_manual(values = c("black", "black"))+
  xlab("")+
  ylab(expression('Viral titer/organ [log '[10]*']'))+
  theme_classic()+
  theme(text = element_text(size=10),
        axis.text.x = element_text(size=10, angle = 90, hjust=1),
        axis.text.y = element_text(size=10), 
        axis.title.y = element_text(size=10), legend.text = element_text(size=10),
        legend.title=element_blank())
titer_plot 

When I want to compare group1 with group2 and group3 to group4 I write:

my_comparisons <- list(c("group1","group2"),c("group3","group4"))
titer_plot + stat_compare_means(comparisons=my_comparisons, label.y=0,
                     method = "t.test")

But then I receive the following error:

Removed 1 rows containing non-finite values (stat_signif).
Computation failed in stat_signif(): missing value where TRUE/FALSE needed

I guess the NA values should not be the problem, since compare_means handles them without complaint.
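A quick base R check seems to support that guess. The sketch below runs a Welch t-test on group3 versus group4 (the group containing the NA), with the values copied from the table above. It uses plain t.test() rather than compare_means, so it only illustrates how the underlying test treats missing values.

```r
# Values copied from the table above; group4 holds the single NA.
group3 <- c(3.903090, 4.342423, 4.113943, 4.653213, 4.230449)
group4 <- c(4.556303, 4.462398, 4.230449, NA, 4.591065)

# t.test() drops missing values before computing, so the test
# runs on 5 versus 4 observations instead of failing.
res <- t.test(group3, group4)
res$p.value
```

If that holds, the first message ("Removed 1 rows containing non-finite values") is only a note about the dropped NA, and the actual failure is the second line.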

I'd be grateful for any help, thanks!


Solution

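  • The problem is that stat_compare_means doesn't know where to put the brackets (x-coordinates): the comparisons name levels of X5 (group1, group2, ...), but the x-axis shows X3. From the documentation: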

    comparisons: A list of length-2 vectors. The entries in the vector are either the names of 2 values on the x-axis or the 2 integers that correspond to the index of the groups of interest, to be compared.

    titer_plot + stat_compare_means(aes(group = X2), method = "t.test",
                                    label.y = 5)  # specify the grouping variable explicitly
    


    I don't know how to get brackets between the dodged groups directly, but the underlying geom_signif has a "manual" argument, so you could probably supply the output of compare_means(X1 ~ X5, data=titer_lung, method="t.test") to that geom.

    You could also consider having 8 entries on the x-axis, one per combination of X3 and X2:

    titer_plot <- ggplot(titer_lung, aes(y=X1, x=paste(X3,X2), fill=X2)) + ...
    
    my_comparisons <- list(c("mutIRFE_FB -NK","mutIRFE_FB NK"),
                       c("mutIRFErev_FB -NK","mutIRFErev_FB NK"),
                       c("d3IDE mutIRFE -NK","d3IDE mutIRFE NK"),
                       c("d3IDE mutIRFErev -NK","d3IDE mutIRFErev NK"))
    titer_plot +  stat_compare_means(comparisons=my_comparisons, label.y=5,
                     method = "t.test")
    

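The geom_signif route mentioned above might look roughly like the following. This is a sketch under several assumptions, not a tested fix for the original plot: it uses only the first two constructs from the question's table, computes the p-values with base t.test() instead of compare_means, and passes bracket coordinates to ggsignif::geom_signif through its xmin, xmax, y_position and annotations arguments. The ±0.2 offsets are derived from position_dodge(0.8) with two groups per x-axis category; titer_sub and p_values are names made up for this example.

```r
# Subset of the question's data (first two constructs), copied from the table.
titer_sub <- data.frame(
  X1 = c(4.531479, 4.068186, 4.071882, 4.117271, 4.117271,
         4.462398, 4.643453, 4.556303, 4.724276, 4.491362,
         3.903090, 4.342423, 4.113943, 4.653213, 4.230449,
         4.556303, 4.462398, 4.230449, NA,       4.591065),
  X2 = rep(rep(c("NK", "-NK"), each = 5), times = 2),
  # Explicit factor levels fix the left-to-right order on the x-axis
  X3 = factor(rep(c("mutIRFE_FB", "mutIRFErev_FB"), each = 10),
              levels = c("mutIRFE_FB", "mutIRFErev_FB"))
)

# One Welch t-test per x-axis category (NK versus -NK); the NA is dropped.
p_values <- sapply(split(titer_sub, titer_sub$X3), function(d) {
  t.test(X1 ~ X2, data = d)$p.value
})

# Plot only if the packages are installed.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggsignif", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(titer_sub, aes(x = X3, y = X1, fill = X2)) +
    geom_dotplot(binaxis = "y", stackdir = "center", dotsize = 0.8,
                 position = position_dodge(0.8)) +
    ggsignif::geom_signif(
      # Category i sits at x = i; the dodged groups sit at i - 0.2 and i + 0.2
      xmin = seq_along(p_values) - 0.2,
      xmax = seq_along(p_values) + 0.2,
      y_position = rep(5, length(p_values)),
      annotations = sprintf("p = %.3f", p_values),
      tip_length = 0
    )
  print(p)
}
```

Computing the p-values outside the plot keeps the statistics separate from the bracket placement, so the labels could equally be taken from the compare_means output if its rows are matched to the right x positions.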