I want to perform a chi-square test of independence on the following dataset, which consists of four categorical variables. The test is performed on two variables at a time, with variable V4 fixed. Essentially, I want to run the test for 3 combinations: V1-V4, V2-V4, and V3-V4. I want to do this in a loop, since the actual analysis covers a large number of combinations.
V1 V2 V3 V4
A SUV Yes Good
A SUV No Good
B SUV No Good
B SUV Yes Satisfactory
C car Yes Excellent
C SUV No Poor
D SUV Yes Poor
D van Yes Satisfactory
E car No Excellent
What I have tried:
x <- c(1:3)
for (i in x) {
  test <- chisq.test(df[, i], df[, 4])
  out <- data.frame("X" = colnames(df)[i]
                    , "Y" = colnames(df[4])
                    , "Chi.Square" = round(test$statistic,3)
                    , "df"= test$parameter
                    , "p.value" = round(test$p.value, 3)
                    )
  return(out)
}
However, I only receive the output for the V1-V4 combination. Reference for code: Chi Square Analysis using for loop in R
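out is overwritten in each iteration with the current result, so only a single result is ever kept. Note also that return() is intended for use inside a function; called in a top-level for loop, it stops the loop with an error after the first iteration, which would explain why only the V1-V4 result appears. Instead, we can initialize a list with the same length as 'x' and store each iteration's output in it: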
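x <- 1:3
# Pre-allocate one list slot per column to be tested against V4
out <- vector('list', length(x))
for (i in x) {
  test <- chisq.test(df[, i], df[, 4])
  # Store this iteration's result instead of overwriting a single object
  out[[i]] <- data.frame("X" = colnames(df[i]),
                         "Y" = colnames(df[4]),
                         "Chi.Square" = round(test$statistic, 3),
                         "df" = test$parameter,
                         "p.value" = round(test$p.value, 3))
}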
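If a single table is more convenient than a list, the per-pair results can be stacked with do.call(rbind, ...). The following is only a sketch under some assumptions: it rebuilds the sample data from the question so that it runs on its own, it uses lapply() over column names rather than the for loop above, and the names target, predictors, run_test and results are made up for illustration. Because the sample is tiny, chisq.test() warns that the approximation may be incorrect; the sketch suppresses that warning, which you may not want in a real analysis.

```r
# Sample data from the question, rebuilt so the example is self-contained
df <- data.frame(
  V1 = c("A", "A", "B", "B", "C", "C", "D", "D", "E"),
  V2 = c("SUV", "SUV", "SUV", "SUV", "car", "SUV", "SUV", "van", "car"),
  V3 = c("Yes", "No", "No", "Yes", "Yes", "No", "Yes", "Yes", "No"),
  V4 = c("Good", "Good", "Good", "Satisfactory", "Excellent", "Poor",
         "Poor", "Satisfactory", "Excellent")
)

# Hypothetical names: the fixed variable and every other column to test against it
target <- "V4"
predictors <- setdiff(names(df), target)

# Run one test and return a one-row data frame (illustrative helper)
run_test <- function(v) {
  # Expected counts are very small here, so chisq.test() would warn;
  # the warning is suppressed only to keep the example output clean
  test <- suppressWarnings(chisq.test(df[[v]], df[[target]]))
  data.frame(X = v,
             Y = target,
             Chi.Square = round(unname(test$statistic), 3),
             df = unname(test$parameter),
             p.value = round(test$p.value, 3))
}

# One row per V*-V4 pair, stacked into a single data frame
results <- do.call(rbind, lapply(predictors, run_test))
print(results)
```

The same do.call(rbind, out) call also works on the list built by the for loop above. Selecting columns by name instead of position means the code keeps working if the column order changes.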