Fix a column in for loop while doing Chi-square test


I want to perform a chi-square test of independence on the following dataset, which consists of four categorical variables. The test is performed on two variables at a time, with variable V4 fixed. Essentially, I want to run the chi-square test for three combinations: V1-V4, V2-V4, and V3-V4. I want to do this in a loop because the actual analysis covers a large number of combinations.

V1  V2  V3  V4
A   SUV Yes Good
A   SUV No  Good
B   SUV No  Good
B   SUV Yes Satisfactory
C   car Yes Excellent
C   SUV No  Poor
D   SUV Yes Poor
D   van Yes Satisfactory
E   car No  Excellent

What I have tried:

x <- c(1:3)
for (i in x) {
  test <- chisq.test(df[, i], df[, 4])
  out <- data.frame("X" = colnames(df)[i]
                    , "Y" = colnames(df[4])
                    , "Chi.Square" = round(test$statistic,3)
                    ,  "df"= test$parameter
                    ,  "p.value" = round(test$p.value, 3)
  )
  return(out)
}

However, I only receive the output for the V1-V4 combination. Reference for the code: Chi Square Analysis using for loop in R


Solution

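  • out is overwritten in each iteration, so at most one result can survive. In addition, return() stops the loop during its first pass (it exits the enclosing function, or raises an error when called outside a function), which is why the OP only sees the V1-V4 result. We can instead initialize a list with the length of 'x' and store each iteration's output in it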

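    x <- 1:3
    # pre-allocate one list slot per column tested against V4
    out <- vector('list', length(x))
    for (i in x) {
      test <- chisq.test(df[, i], df[, 4])
      # store this iteration's result instead of overwriting out
      out[[i]] <- data.frame("X" = colnames(df[i]),
                             "Y" = colnames(df[4]),
                             "Chi.Square" = round(test$statistic, 3),
                             "df" = test$parameter,
                             "p.value" = round(test$p.value, 3))
    }
    # bind the per-pair data frames into a single table
    out <- do.call(rbind, out)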
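
If the real data has many columns, selecting them by name rather than by position may scale better. Below is a minimal sketch of that idea using lapply, not a definitive implementation. It rebuilds the sample data from the question so it runs on its own. The names fixed, others, and results are illustrative and not part of the original code. The suppressWarnings() call is also an addition: with cell counts this small, chisq.test() warns that the chi-squared approximation may be incorrect.

```r
# Sample data copied from the question.
df <- data.frame(
  V1 = c("A", "A", "B", "B", "C", "C", "D", "D", "E"),
  V2 = c("SUV", "SUV", "SUV", "SUV", "car", "SUV", "SUV", "van", "car"),
  V3 = c("Yes", "No", "No", "Yes", "Yes", "No", "Yes", "Yes", "No"),
  V4 = c("Good", "Good", "Good", "Satisfactory", "Excellent",
         "Poor", "Poor", "Satisfactory", "Excellent")
)

fixed  <- "V4"                       # column held fixed in every test (illustrative name)
others <- setdiff(names(df), fixed)  # every other column is paired with it

# One single-row data frame per pair; lapply collects them in a list.
results <- lapply(others, function(v) {
  # Small expected counts trigger an approximation warning; silenced here (assumption).
  test <- suppressWarnings(chisq.test(df[[v]], df[[fixed]]))
  data.frame(X = v,
             Y = fixed,
             Chi.Square = round(unname(test$statistic), 3),
             df = unname(test$parameter),
             p.value = round(test$p.value, 3))
})

# Stack the per-pair rows into one table.
result <- do.call(rbind, results)
print(result)
```

Because the columns are picked with setdiff(), adding more variables to df extends the loop without changing any indices. In a real analysis the warning should probably be inspected rather than suppressed, since it signals that the p-values may be unreliable.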