
R - stat_compare_means returns a different value from the Kruskal-Wallis test


I want to add the p-value of a Kruskal-Wallis test to my ggplot using the function stat_compare_means from the ggpubr package.

However, the plotted value differs from the one I get when I run the test directly:

kruskal.test(value ~ type, data = Profile_melt)

My code to plot the p-value is:

library(ggplot2)
library(ggpubr)   # stat_compare_means()
library(cowplot)  # background_grid(), panel_border()

ggplot(Profile_melt, aes(type, value)) + 
  geom_boxplot(aes(fill = factor(type), alpha = 0.5), 
               outlier.shape = NA, show.legend = FALSE) +
  geom_jitter(width = 0.2, size = 2, show.legend = FALSE,
              aes(colour = factor(type)), alpha = 0.5) +
  theme_bw() +
  facet_grid(Case ~ Marker, scales = 'free') +
  stat_compare_means(comparison = list(c("Real", "Binomial")), method = 'kruskal.test') +
  background_grid(major = 'y', minor = "none") + # add thin horizontal lines 
  xlab('Category') +
  ylab('Cell counts (Frequencies)') +
  theme(axis.text = element_text(size = 15), 
        axis.title = element_text(size = 20), 
        legend.text = element_text(size = 38),
        legend.title = element_text(size = 30), 
        strip.background = element_rect(colour = "black", fill = "white"),
        strip.text = element_text(margin = margin(10, 10, 10, 10), size = 25)) +
  panel_border()

Here is my sample data.


Solution

  • Many of the code lines are not relevant to the question. It could be reduced to:

    why does

    kruskal.test(value ~ type, data = Profile_melt)
    
    #Kruskal-Wallis chi-squared = 4.9673, df = 1, p-value = 0.02583
    

    produce a different p value from

    ggboxplot(Profile_melt, x="type", y = "value") + 
      stat_compare_means(comparison = list(c("Real", "Binomial")), method = 'kruskal.test')
    
    # p-value = 0.49
    

    The reason probably lies in how ggpubr handles the comparison argument. Checking the package source, or asking the ggpubr developer, would confirm this and get it fixed there if it is a bug. To get a p-value consistent with kruskal.test, remove comparison = list(c("Real", "Binomial")):

    ggboxplot(Profile_melt, x="type", y = "value") + 
      stat_compare_means(method = 'kruskal.test')
    

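    or (edit) keep the comparison and drop the method argument, in which case stat_compare_means falls back to its default test for the pair: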

    ggboxplot(Profile_melt, x="type", y = "value") + 
      stat_compare_means(comparison = list(c("Real", "Binomial")))
    

    With the rest of your code, the graph looks like this:

    [image: resulting plot]
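
One possible explanation for the discrepancy, offered as an assumption and not confirmed against the ggpubr source: when a pairwise comparison is requested, the values of the two groups may be passed to the test function as two separate numeric vectors. Base R's kruskal.test(x, g) treats its second positional argument as the grouping variable, so such a call would not compare the two samples at all. The sketch below uses only base R and made-up numbers (the names real, binomial and toy are hypothetical, not the asker's Profile_melt) to show the difference between the two calling styles.

```r
# Toy data: hypothetical values, NOT the asker's Profile_melt.
real     <- c(12, 15, 14, 18, 20, 17, 16, 19)
binomial <- c(10, 11, 13, 9, 14, 12, 8, 15)

toy <- data.frame(
  type  = rep(c("Real", "Binomial"), each = 8),
  value = c(real, binomial)
)

# Formula interface: compares `value` between the two levels of `type`.
p_formula <- kruskal.test(value ~ type, data = toy)$p.value

# Default method with an explicit grouping factor: the same test.
p_groups <- kruskal.test(toy$value, factor(toy$type))$p.value

# Two numeric vectors passed positionally: the second vector is taken as
# the grouping variable `g`, so this does NOT compare `real` with `binomial`.
p_positional <- kruskal.test(real, binomial)$p.value

print(c(formula = p_formula, groups = p_groups, positional = p_positional))
```

The first two p-values agree, while the third comes from a different test altogether. If the pairwise code path does call the test this way, that would account for a plotted value that does not match kruskal.test(value ~ type, data = Profile_melt).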