I am trying to add p-values to my graph using the "stat_signif" function.
The problem is that my plot is a grouped box plot, and I want to compare the two box plots within each category. However, stat_signif requires x-axis values to define the comparisons.
This is my code:
p <- ggplot(plot.data, aes(x = Element, y = Value, fill = Group)) + #Define the elements for plotting - group by "strandness".
geom_boxplot(outlier.shape = NA, colour = "black") +
scale_fill_manual(values = c("goldenrod","darkgreen")) +
coord_cartesian(ylim = c(0, 0.03)) +
stat_summary(fun.y=mean, colour="black", geom ="point", shape=18, size=4 ,show.legend = FALSE, position = position_dodge(0.75)) +
theme(legend.title=element_blank(),legend.text = element_text(size=16), axis.text.x = element_text(color = "black", size = 12), axis.text.y = element_text(color = "black", size = 12),
panel.background = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "black"),
panel.border = element_rect(colour = "black", fill=NA, size=0.5),
legend.key = element_rect(colour = "transparent", fill = "white")) +
theme(plot.title = element_text(lineheight=.8, hjust = 0.5, size = 20),axis.title.y = element_text(size = 20, angle = 90, margin = margin(t = 0, r = 20, b = 0, l = 0))) +
labs(x = "", y = paste0(dinuc, " frequency")) +
theme(plot.margin = unit(c(2,1,1,1), "cm")) +
#stat_compare_means(aes(group = group))
stat_signif(comparisons = list(c("Genes", "mRNA")),
test = "wilcox.test", test.args = list(paired = FALSE, exact = FALSE, correct = FALSE),
map_signif_level = TRUE, y_position = 0.02)
Where the plot.data data frame looks like:
Group, Value, Element
1 Transcribed, 0.004814926, Genes
2 Non-transcribed, 0.008926, Genes
3 Transcribed, 0.086000026, mRNA
4 Non-transcribed, 0.00548, mRNA
5 Transcribed, 0.258400078, Exons
6 Non-transcribed, 0.23008457, Exons
7 Transcribed, 0.00005687, Introns
8 Non-transcribed, 0.890000521, Introns
etc. (For every element there are about 10000 rows)
This is the figure obtained by the code. What I actually want is to compare the transcribed and non-transcribed box plots within every element.
You haven't posted enough data to get p-values, so I'm posting an example that you can adjust to your dataset:
library(tidyverse)
library(ggpubr)
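mtcars %>%
# treat the x-axis and grouping variables as categorical
mutate_at(vars(am, cyl), as.factor) %>%
ggplot(aes(cyl, disp, fill=am))+
geom_boxplot()+
# group = am runs one test per cyl level, comparing the two am groups within it
stat_compare_means(aes(group = am))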
You can use stat_compare_means(aes(group = am), label = "p.format") if you want to have only the p-values in your plot.
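As a rough sketch of how this might carry over to your data, assuming the column names Group, Value and Element from your question: the values below are simulated placeholders (not your data), and the p.values vector is only my own sanity check, computing the same per-element Wilcoxon test directly so you can compare it with the labels on the plot.

```r
library(ggplot2)
library(ggpubr)

# Simulated stand-in for plot.data (hypothetical values, NOT the real data)
set.seed(1)
elements <- c("Genes", "mRNA", "Exons", "Introns")
groups <- c("Transcribed", "Non-transcribed")
plot.data <- expand.grid(rep = 1:200, Group = groups, Element = elements)
plot.data$Value <- rexp(nrow(plot.data), rate = 100)

# Sanity check: one Wilcoxon rank-sum p-value per element, computed directly
p.values <- sapply(split(plot.data, plot.data$Element),
                   function(d) wilcox.test(Value ~ Group, data = d)$p.value)

p <- ggplot(plot.data, aes(x = Element, y = Value, fill = Group)) +
  geom_boxplot(outlier.shape = NA, colour = "black") +
  scale_fill_manual(values = c("goldenrod", "darkgreen")) +
  # group = Group compares the two fill groups within each x-axis category
  stat_compare_means(aes(group = Group), method = "wilcox.test",
                     label = "p.format")
p
```

If you keep coord_cartesian(ylim = ...) from your original plot, the labels may end up outside the visible range. In that case the label.y argument of stat_compare_means should let you place them by hand.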