I have a dataframe with three variables: Var1 (values A, B, and C), Var2 (values X and Y), and Metric (numeric values). Every Var1 group contains multiple rows for each Var2 value (the rows are distinguished by other variables, which I have collapsed here because they are not relevant).
For every Var1 group I would like to compare the mean of Metric between the Var2 groups (X mean vs. Y mean) using a statistical test to determine whether they are significantly different (the exact test is not important; ideally the solution is test-agnostic). Doing this for a single group is easy: I filter the data, run TukeyHSD, and pick out the pair of interest. In the example below I filter for 'A' in Var1 and get the p-value for X vs. Y:
#Generating dataframe
set.seed(3)
df = data.frame(Var1 = sample(c("A","B","C"), 15, replace=TRUE),
                Var2 = sample(c("X","Y"), 15, replace=TRUE),
                Metric = 1:15)
df
#Calculating significance between X and Y for A in Var1
stat = aov(Metric ~ Var2, data = subset(df, Var1 %in% c("A")))
summary(stat)
TukeyHSD(stat)
Ideally I would like to automate this with some form of loop, so that I only have to input the Var1 groups of interest (my actual dataset has many more than 3 groups) and get back the X vs. Y p-value for each of them (A, B, and C here) in an easily accessible format, such as a data frame or a list of outputs.
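If I understood correctly, you want one X vs. Y comparison per Var1 group. Nesting the data by Var1 and mapping the model over the nested data frames does that. Note that transmute() drops the Var1 column, so the three rows of the output correspond to the sorted groups A, B, and C: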
#Generating dataframe
set.seed(3)
df = data.frame(Var1 = sample(c("A","B","C"), 100, replace=TRUE),
                Var2 = sample(c("X","Y"), 100, replace=TRUE),
                Metric = 1:100)
library(tidyverse)
df %>%
  group_nest(Var1) %>%
  mutate(AOV = map(data, ~aov(Metric ~ Var2, data = .x))) %>%
  transmute(Tukey_HSD = map(AOV, TukeyHSD) %>% map(broom::tidy)) %>%
  unnest(Tukey_HSD)
#> # A tibble: 3 x 7
#> term contrast null.value estimate conf.low conf.high adj.p.value
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Var2 Y-X 0 0.931 -20.6 22.4 0.930
#> 2 Var2 Y-X 0 -8.06 -27.3 11.2 0.400
#> 3 Var2 Y-X 0 8.90 -14.1 31.9 0.435
Created on 2021-11-04 by the reprex package (v2.0.1)
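With variable filtering, restricting the analysis to the Var1 groups of interest (Var1 is kept in transmute() here so that each row is labelled with its group):
filter_vars <- c("B", "C")
df %>%
  filter(Var1 %in% filter_vars) %>%
  group_nest(Var1) %>%
  mutate(AOV = map(data, ~aov(Metric ~ Var2, data = .x))) %>%
  # keep Var1 so the result shows which group each comparison belongs to
  transmute(Var1, Tukey_HSD = map(AOV, TukeyHSD) %>% map(broom::tidy)) %>%
  unnest(Tukey_HSD)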
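The question also asks for a solution that does not depend on the particular test. A minimal base R sketch of that idea follows. It assumes the test function accepts a formula and a `data` argument and returns an object with a `p.value` element, as `t.test` and `wilcox.test` do. `compare_groups` is a hypothetical helper name introduced here for illustration, not a function from any package.

```r
# Same simulated data as above
set.seed(3)
df <- data.frame(Var1 = sample(c("A", "B", "C"), 100, replace = TRUE),
                 Var2 = sample(c("X", "Y"), 100, replace = TRUE),
                 Metric = 1:100)

# Hypothetical helper: run test_fun on Metric ~ Var2 within each requested
# Var1 group and collect the p-values in a data frame.
# Assumption: test_fun(formula, data = ...) returns a list with $p.value.
compare_groups <- function(data, groups, test_fun = t.test, ...) {
  p_values <- sapply(groups, function(g) {
    group_data <- data[data$Var1 == g, ]
    test_fun(Metric ~ Var2, data = group_data, ...)$p.value
  })
  data.frame(Var1 = groups, p.value = unname(p_values))
}

# X vs. Y within B and C, using the default Welch t-test
compare_groups(df, c("B", "C"))

# Same comparison for all groups with a different test
compare_groups(df, c("A", "B", "C"), test_fun = wilcox.test)
```

The default `t.test` is Welch's test, so its p-values will generally differ a little from the Tukey output above. Extra arguments are passed through `...`, for example `compare_groups(df, c("B", "C"), var.equal = TRUE)` for the pooled-variance t-test. Each group needs enough X and Y rows for the chosen test to run.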