Likert Analysis: Using the Wilcoxon test to determine significance between pre- and post-survey data in R


I have very little experience in statistics, and any help would be greatly appreciated. I am currently using R to do this.

I am trying to determine the significance of Likert survey data. I have one sheet each for our pre- and post-surveys, each with 25 questions (columns) and responses from 59 students (rows).

I am using the Wilcoxon signed rank test, i.e. testing the significance of the first question by comparing the first column of the pre-survey data with the first column of the post-survey data, then the second question, and so on.

How do I write a for loop to run the Wilcoxon test across responses to all questions?

I've found this so far, but haven't been able to run it correctly. Even help with understanding some of the syntax would be great.

test.fun <- function(dat, col) {

  c1 <- combn(unique(dat$group), 2)
  sigs <- list()
  for (i in 1:ncol(c1)) {
    sigs[[i]] <- wilcox.test(
      dat[dat$group == c1[1, i], col],
      dat[dat$group == c1[2, i], col]
    )
  }
  names(sigs) <- paste("Group", c1[1, ], "by Group", c1[2, ])

  tests <- data.frame(Test = names(sigs),
                      W = unlist(lapply(sigs, function(x) x$statistic)),
                      p = unlist(lapply(sigs, function(x) x$p.value)),
                      row.names = NULL)

  return(tests)
}


tests <- lapply(colnames(dat)[-1], function(x) test.fun(dat, x))
names(tests) <- colnames(dat)[-1]

Thank you so much in advance.


Solution

  • Here's an example of running the Wilcoxon test based on what you described:

    Example data: assuming the responses are rated from 1 to 5, and the pre and post surveys are kept in separate objects with matching column names.

    # 59 rows (students) x 25 columns (questions) of random scores from 1 to 5
    pre <- replicate(25, sample(1:5, 59, replace = TRUE))
    post <- replicate(25, sample(1:5, 59, replace = TRUE))
    colnames(pre) <- colnames(post) <- paste0("Q", 1:25)
    

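    To run the Wilcoxon test, use lapply to loop over the column names: for each name (x), pre[,x] and post[,x] are passed to the test. The results of all 25 tests are returned as a list (all.test). Note that wilcox.test defaults to paired = FALSE, so the call below runs the unpaired rank sum test; for the signed rank test on matched students (row i of pre and row i of post being the same student), add paired = TRUE.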

    all.test <- lapply(colnames(post), function(x) wilcox.test(pre[,x], post[,x]))
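
    Since the question asks for a for loop and for the signed rank test, here is a minimal sketch of that variant. It assumes that row i of pre and row i of post belong to the same student; the name paired.test and the set.seed call are only for illustration, and the data are simulated, so the p-values mean nothing.

```r
# Simulated stand-in data (assumption): 59 students x 25 questions, scores 1-5,
# with row i of `pre` and row i of `post` belonging to the same student.
set.seed(1)
pre  <- replicate(25, sample(1:5, 59, replace = TRUE))
post <- replicate(25, sample(1:5, 59, replace = TRUE))
colnames(pre) <- colnames(post) <- paste0("Q", 1:25)

# Pre-allocate a named list, then fill it one question at a time.
paired.test <- vector("list", ncol(pre))
names(paired.test) <- colnames(pre)

for (q in colnames(pre)) {
  # paired = TRUE requests the signed rank test on within-student differences.
  # exact = FALSE asks for the normal approximation; with Likert data, ties and
  # zero differences can otherwise trigger "cannot compute exact p-value" warnings.
  paired.test[[q]] <- wilcox.test(pre[, q], post[, q],
                                  paired = TRUE, exact = FALSE)
}

# Inspect a single question by name
paired.test[["Q1"]]$p.value
```

    With paired = TRUE the reported statistic is V (the sum of positive ranks) rather than W, so the numbers are not comparable with the unpaired output.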
    

    To extract the statistic and p-value for all tests at once, use sapply to loop over the tests, extract ('[') both "statistic" and "p.value" from each, and return them as a matrix. The example data are random, so your numbers will differ from the ones shown here.

    sapply(all.test, '[', c("statistic", "p.value"))
              [,1]      [,2]    [,3]     [,4]      [,5]      [,6]      [,7]      [,8]       [,9]      [,10]    
    statistic 1556.5    1931    1718     1894.5    1659      1785      1652.5    2066.5     1912.5    1658     
    p.value   0.3122574 0.29259 0.903439 0.3976938 0.6555535 0.8086088 0.6301387 0.07366355 0.3451189 0.6523042
              [,11]     [,12]     [,13]    [,14]      [,15]     [,16]     [,17]      [,18]     [,19]     [,20]    
    statistic 1605.5    1723      1678.5   1368       1810      1813      2082        1595      1862.5    1588     
    p.value   0.4591325 0.9254467 0.735141 0.04035589 0.7040117 0.6920512 0.06084165 0.4222573 0.5027951 0.4028304
              [,21]     [,22]     [,23]     [,24]     [,25]    
    statistic 1986.5    1816.5    1854      1819.5    1859.5   
    p.value   0.1772745 0.6779568 0.5344984 0.6661272 0.5147567
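
    The matrix above labels the tests only by position. If a table with one row per question is easier to work with, something along these lines should do it. This is a sketch on the same kind of simulated data; the name results and its column names are my own choice, not anything wilcox.test provides.

```r
# Simulated stand-in data (assumption), same layout as the example above.
set.seed(1)
pre  <- replicate(25, sample(1:5, 59, replace = TRUE))
post <- replicate(25, sample(1:5, 59, replace = TRUE))
colnames(pre) <- colnames(post) <- paste0("Q", 1:25)

# One test per question, as in the answer (unpaired by default).
all.test <- lapply(colnames(post), function(x) wilcox.test(pre[, x], post[, x]))

# vapply checks that every test yields exactly one number, so the
# result is a plain numeric vector rather than a list.
results <- data.frame(
  question  = colnames(post),
  statistic = vapply(all.test, function(t) unname(t$statistic), numeric(1)),
  p.value   = vapply(all.test, function(t) t$p.value, numeric(1))
)

# Rows can now be filtered or sorted by question, e.g. the smallest p-values first
head(results[order(results$p.value), ])
```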