I have very little experience in statistics, and any help would be greatly appreciated. I am currently using R to do this.
I am trying to determine the significance of Likert-scale survey data. I have one sheet each for our pre- and post-surveys, each with 25 questions (columns) and responses from 59 students (rows).
I am using the Wilcoxon signed-rank test, i.e. testing the significance of the first question by comparing the first column of the pre-survey data with the first column of the post-survey data, then the second question, and so on.
How do I write a for loop to run the Wilcoxon test across responses to all questions?
I've found this so far, but haven't been able to run it correctly. Even help with understanding some of the syntax would be great.
test.fun <- function(dat, col) {
  c1 <- combn(unique(dat$group), 2)
  sigs <- list()
  for (i in 1:ncol(c1)) {
    sigs[[i]] <- wilcox.test(
      dat[dat$group == c1[1, i], col],
      dat[dat$group == c1[2, i], col]
    )
  }
  names(sigs) <- paste("Group", c1[1, ], "by Group", c1[2, ])
  tests <- data.frame(Test = names(sigs),
                      W = unlist(lapply(sigs, function(x) x$statistic)),
                      p = unlist(lapply(sigs, function(x) x$p.value)), row.names = NULL)
  return(tests)
}
tests <- lapply(colnames(dat)[-1], function(x) test.fun(dat, x))
names(tests) <- colnames(dat)[-1]
Thank you so much in advance.
Here's an example of running the Wilcoxon test based on what you've described:
Example data: assuming that the responses are rated from 1 to 5, and that the pre and post surveys are kept in separate matrices with matching column names.
pre <- replicate(25, sample(1:5, 59, replace = TRUE))
post <- replicate(25, sample(1:5, 59, replace = TRUE))
colnames(pre) <- colnames(post) <- paste0("Q", 1:25)
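To run the Wilcoxon test, use lapply to loop over each column name (x), passing pre[,x] and post[,x] to wilcox.test. The results of all 25 tests are returned as a list (all.test). Note that, as written, the call runs the unpaired rank-sum form of the test, which is what the output further down reflects; for the signed-rank test on matched pre/post responses, add paired = TRUE to the call: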
all.test <- lapply(colnames(post), function(x) wilcox.test(pre[,x], post[,x]))
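Since the question asks for a for loop specifically, here is a minimal sketch of the same iteration written that way. It mirrors the lapply call above, so any extra argument (such as paired = TRUE) would go in the same place. The data are simulated again so the block runs on its own; the seed and the name loop.test are assumptions made for illustration, not part of the original example.

```r
# Simulated data in the same shape as the example (random values, not real survey results).
set.seed(1)  # assumed seed, only so the sketch is reproducible
pre  <- replicate(25, sample(1:5, 59, replace = TRUE))
post <- replicate(25, sample(1:5, 59, replace = TRUE))
colnames(pre) <- colnames(post) <- paste0("Q", 1:25)

# Pre-allocate one slot per question, named after the columns.
loop.test <- vector("list", ncol(pre))
names(loop.test) <- colnames(pre)

# Loop over the question names and store each test result under that name.
for (q in colnames(pre)) {
  loop.test[[q]] <- wilcox.test(pre[, q], post[, q])
}

# Individual results can then be looked up by question name.
loop.test[["Q1"]]$p.value
```

Pre-allocating the list and indexing by name keeps each result attached to its question, which is the main thing lapply does implicitly.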
To extract the statistic and p-value from all tests at once, use sapply to loop over the test results and extract ('[') both the "statistic" and "p.value" elements; the result is returned as a matrix with one column per question.
sapply(all.test, '[', c("statistic", "p.value"))
          [,1]      [,2]    [,3]     [,4]      [,5]      [,6]      [,7]      [,8]       [,9]      [,10]
statistic 1556.5    1931    1718     1894.5    1659      1785      1652.5    2066.5     1912.5    1658
p.value   0.3122574 0.29259 0.903439 0.3976938 0.6555535 0.8086088 0.6301387 0.07366355 0.3451189 0.6523042
          [,11]     [,12]     [,13]    [,14]      [,15]     [,16]     [,17]      [,18]     [,19]     [,20]
statistic 1605.5    1723      1678.5   1368       1810      1813      2082       1595      1862.5    1588
p.value   0.4591325 0.9254467 0.735141 0.04035589 0.7040117 0.6920512 0.06084165 0.4222573 0.5027951 0.4028304
          [,21]     [,22]     [,23]     [,24]     [,25]
statistic 1986.5    1816.5    1854      1819.5    1859.5
p.value   0.1772745 0.6779568 0.5344984 0.6661272 0.5147567
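A minimal sketch of the paired (signed-rank) variant, assuming that row i of pre and row i of post belong to the same student. It also collects the results into a data frame with one row per question, which may be easier to read than the wide matrix. The seed and the names paired.test and results are assumptions for illustration; exact = FALSE is passed to request the normal approximation, since Likert data contain ties and zero differences that rule out an exact p-value.

```r
# Simulated data in the same shape as the example (random values, not real survey results).
set.seed(42)  # assumed seed, only so the sketch is reproducible
pre  <- replicate(25, sample(1:5, 59, replace = TRUE))
post <- replicate(25, sample(1:5, 59, replace = TRUE))
colnames(pre) <- colnames(post) <- paste0("Q", 1:25)

# paired = TRUE selects the signed-rank test on the within-student differences.
paired.test <- lapply(colnames(pre), function(x)
  wilcox.test(pre[, x], post[, x], paired = TRUE, exact = FALSE))

# One row per question: the signed-rank statistic V and its p-value.
results <- data.frame(
  question = colnames(pre),
  V        = sapply(paired.test, function(res) unname(res$statistic)),
  p.value  = sapply(paired.test, function(res) res$p.value)
)

head(results)
```

With real data, it is worth checking that both sheets are sorted by the same student identifier before running this, because the paired test matches observations purely by row position.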