I'm trying to plot dodged columns with two levels of grouping: first the "REGION" group, then the "name" subgroup, which holds the "Pre" and "Post" intervention values.
I've tried ggsignif using geom_signif(comparisons = list(c("Pre_Ambos", "Post_Ambos"))), but it gives me:
Warning message:
Computation failed in stat_signif()
Caused by error in if (scales$x$map(comp[1]) == data$group[1] | manual) ...:
! missing value where TRUE/FALSE needed
Here is my code:
library(tidyverse)
# Build dataframe
REGION <- c("Arica", "Tarapacá", "Antofagasta", "Atacama", "Coquimbo", "Valparaíso", "Metropolitana", "O'Higgins", "Maule", "Ñuble", "Bíobío", "Araucanía", "Los Ríos", "Los Lagos", "Aysén", "Magallanes", "Chile")
Pre_Ambos <- c(11.33, 9.96, 10.24, 14.17, 13.43, 12.96, 11.47, 14.54, 14.58, 18.00, 12.19, 15.34, 16.10, 17.64, 16.34, 15.04, 13.96)
Post_Ambos <- c(8.54, 7.60, 7.86, 10.44, 10.01, 11.97, 9.45, 13.07, 13.76, 11.56, 10.37, 14.48, 13.14, 15.04, 14.74, 12.07, 11.51)
Dif_Ambos_Porc <- c(24.61, 23.74, 23.20, 26.30, 25.49, 7.67, 17.59, 10.10, 5.67, 35.80, 14.94, 5.60, 18.37, 14.74, 9.78, 19.76, 17.57)
Table1 <- data.frame(REGION, Pre_Ambos, Post_Ambos, Dif_Ambos_Porc)
View(Table1)
# Pivoting
Table1 |>
  select(REGION, Pre_Ambos, Post_Ambos, Dif_Ambos_Porc) |>
  pivot_longer(cols = c("Pre_Ambos", "Post_Ambos")) |>
  mutate(name = forcats::fct_relevel(name, c("Pre_Ambos", "Post_Ambos"))) -> Table1_p
# Plotting
ggplot(Table1_p, aes(x = REGION, y = value, fill = name)) +
  geom_col(position = "dodge") +
  scale_x_discrete(limits = Table1_p$REGION)
Here is the plot.
I'm only interested in the significance of the difference between "Pre_Ambos" and "Post_Ambos" within each REGION. Thanks a lot.
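Your expected output doesn't make sense to me: in your example you have a single 'Pre_Ambos' and a single 'Post_Ambos' value for each region, so a per-region statistical test (t-test or Wilcoxon test) would be comparing just two values. You can still draw it: geom_signif() matches the names in comparisons against the x axis, which is presumably why your call failed with x = REGION, so put name on the x axis and facet by REGION, e.g.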
library(tidyverse)
library(ggsignif)
# Build dataframe
REGION <- c("Arica", "Tarapacá", "Antofagasta", "Atacama", "Coquimbo", "Valparaíso", "Metropolitana", "O'Higgins", "Maule", "Ñuble", "Bíobío", "Araucanía", "Los Ríos", "Los Lagos", "Aysén", "Magallanes", "Chile")
Pre_Ambos <- c(11.33, 9.96, 10.24, 14.17, 13.43, 12.96, 11.47, 14.54, 14.58, 18.00, 12.19, 15.34, 16.10, 17.64, 16.34, 15.04, 13.96)
Post_Ambos <- c(8.54, 7.60, 7.86, 10.44, 10.01, 11.97, 9.45, 13.07, 13.76, 11.56, 10.37, 14.48, 13.14, 15.04, 14.74, 12.07, 11.51)
Dif_Ambos_Porc <- c(24.61, 23.74, 23.20, 26.30, 25.49, 7.67, 17.59, 10.10, 5.67, 35.80, 14.94, 5.60, 18.37, 14.74, 9.78, 19.76, 17.57)
Table1 <- data.frame(REGION, Pre_Ambos, Post_Ambos, Dif_Ambos_Porc)
# Pivoting
Table1 |>
  select(REGION, Pre_Ambos, Post_Ambos, Dif_Ambos_Porc) |>
  pivot_longer(cols = c("Pre_Ambos", "Post_Ambos")) |>
  mutate(name = forcats::fct_relevel(name, c("Pre_Ambos", "Post_Ambos"))) -> Table1_p
# Plotting: name goes on the x axis so geom_signif can find the two groups
ggplot(Table1_p, aes(x = name, y = value, fill = name)) +
  geom_col(position = "dodge") +
  #scale_x_discrete(limits = REGION) +
  geom_signif(comparisons = list(c("Pre_Ambos", "Post_Ambos"))) +
  facet_wrap(~REGION, nrow = 1)
Created on 2024-05-15 with reprex v2.1.0
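To check what each facet is actually testing, here is a minimal sketch in base R that runs the same kind of test (wilcox.test, the default test in geom_signif) on one region's pair of values. The variable names pre, post and single_test are mine, not from the question.

```r
# One region's values (Arica), copied from the data above
pre  <- 11.33
post <- 8.54

# Unpaired Wilcoxon rank-sum test with a single observation per group
single_test <- wilcox.test(pre, post)

# Inspect the p-value that would be drawn above the bracket
print(single_test$p.value)
```

With only one observation in each group, the two-sided p-value stays at 1 for any two distinct values, so the brackets in the faceted plot carry no information.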
Note, however, that each region's p-value then comes from comparing one value against one value (e.g. ~10 vs ~7.5 for the first region, p = ~1), which is not informative. That layout is fine if you have multiple values for each region, but if you want a single test of all Pre_Ambos values vs all Post_Ambos values, you need to change your figure, e.g.
ggplot(Table1_p, aes(x = name, y = value, fill = REGION)) +
  geom_col(position = "dodge") +
  geom_signif(comparisons = list(c("Pre_Ambos", "Post_Ambos")))
#> Warning in wilcox.test.default(c(11.33, 9.96, 10.24, 14.17, 13.43, 12.96, :
#> cannot compute exact p-value with ties
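If you want the numbers rather than just the bracket, the overall test can also be run directly in base R. This is only a sketch: the first call should correspond to the unpaired wilcox.test that geom_signif runs by default, and the second is a paired (signed-rank) version, which is my own suggestion on the assumption that each region's Pre and Post values form a natural pair. The names unpaired_test and paired_test are mine.

```r
# Same values as in the question, one Pre and one Post value per region
Pre_Ambos <- c(11.33, 9.96, 10.24, 14.17, 13.43, 12.96, 11.47, 14.54, 14.58,
               18.00, 12.19, 15.34, 16.10, 17.64, 16.34, 15.04, 13.96)
Post_Ambos <- c(8.54, 7.60, 7.86, 10.44, 10.01, 11.97, 9.45, 13.07, 13.76,
                11.56, 10.37, 14.48, 13.14, 15.04, 14.74, 12.07, 11.51)

# Unpaired rank-sum test: treats Pre and Post as two independent samples
# (this one warns about ties, as in the plot output above)
unpaired_test <- wilcox.test(Pre_Ambos, Post_Ambos)

# Paired signed-rank test: compares Pre and Post within each region
# (assumption: rows are matched by region, which they are in Table1)
paired_test <- wilcox.test(Pre_Ambos, Post_Ambos, paired = TRUE)

print(unpaired_test$p.value)
print(paired_test$p.value)
```

The paired version uses the within-region differences, so it is usually the more sensitive choice for a pre/post design, but whether pairing is appropriate depends on how your data were collected.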
Does that make sense?