I would like to set all the 40 categorical variables in my datafile against each other (= 160 crosstabs) and gather the p-values of all the Chi-Square tests, preferably in one list, in order to see which variables are most closely related.
Is there an R code to execute this request in a simple way?
You can use the combn function to find all combinations and run any number of variables against each other.
As a simple solution, if you have a data.table named dt and the variable you want to test every other column against is result, then use the following code.
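library(data.table)
library(dplyr)   # mutate(), select(), arrange(); also provides %>%
library(purrr)   # map()
library(tidyr)   # unnest()
library(tibble)  # tibble()

# Test every column of dt against dt$result (needs the broom package installed),
# then collect the tidied results in one table sorted by p-value.
chi_dt <- dt %>%
  map(~ chisq.test(.x, dt$result)) %>%
  tibble(names = names(.), data = .) %>%
  mutate(stats = map(data, broom::tidy)) %>%
  unnest(stats) %>% select(-data) %>%
  arrange(p.value, desc(statistic))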
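
The code above only tests each column against result. For the all-against-all case, the combn approach could look roughly like the sketch below. It uses base R only, and the toy data frame df with columns a, b and c is made up for illustration; replace it with your own data. Note that 40 variables yield choose(40, 2) = 780 unique pairs.

```r
# Toy data standing in for the real file (hypothetical columns a, b, c)
set.seed(1)
df <- data.frame(
  a = sample(c("x", "y"), 200, replace = TRUE),
  b = sample(c("u", "v", "w"), 200, replace = TRUE),
  c = sample(c("p", "q"), 200, replace = TRUE),
  stringsAsFactors = TRUE
)

# All unordered pairs of column names: a 2 x choose(ncol(df), 2) matrix
pairs <- combn(names(df), 2)

# One chi-square test per pair, keeping only the p-value.
# suppressWarnings() hides the "approximation may be incorrect" warning
# that chisq.test() raises for small expected counts.
p_values <- apply(pairs, 2, function(p) {
  suppressWarnings(chisq.test(df[[p[1]]], df[[p[2]]])$p.value)
})

# One row per pair, smallest p-value first
chi_all <- data.frame(var1 = pairs[1, ], var2 = pairs[2, ], p.value = p_values)
chi_all <- chi_all[order(chi_all$p.value), ]
chi_all
```

With several hundred tests, some small p-values will turn up by chance alone, so it may be worth adjusting them, for example with p.adjust(chi_all$p.value, method = "BH").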