Tags: r, pairwise.wilcox.test

How do I carry out wilcox.test considering every 4 rows in R?


I have a very large data frame consisting of two variables (here A and B) and 134,000 observations (134000/4 = 33500 groups).

I'm unsure how to run a paired wilcox.test on every block of four rows. With the example data below, I want to compare A vs B using rows 1:4 for the first result, rows 5:8 for the second, and rows 9:12 for the third.

  df1 <- as.data.frame(cbind(
      A = c(0.67, 0.45, 0.76, 0.67, 0.56, 0.88, 0.34, 0.56, 0.35, 0.45, 0.67, 0.87),
      B = c(0.45, 0.54, 0.67, 0.86, 0.23, 0.56, 0.34, 0.66, 0.21, 0.55, 0.56, 0.45)))

## for one group only (rows 1:4 of both columns)

   check <- wilcox.test(df1[1:4, 1], df1[1:4, 2])

I can see there are examples where the data frame is in wide format (so the columns would be A1, A2, A3, A4, B1, B2, B3, B4), such as "Run wilcoxon rank sum test on each row of a data frame", but I would prefer to keep it in long format if possible.

Any guidance would be greatly appreciated.


Solution

  • We could split the data frame by a grouping variable created with gl and apply wilcox.test to each element of the resulting list

    lapply(split(df1, as.integer(gl(nrow(df1), 4, nrow(df1)))), 
          function(x) wilcox.test(x$A, x$B))
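
Since the question asks for a paired test, here is a minimal sketch of the same split-and-apply idea with paired = TRUE, plus a step that collects the p-values into one vector. The names grp, tests and p_values are made up for illustration, and the grouping is built with integer division instead of gl; both give one id per block of four consecutive rows. It assumes the number of rows is a multiple of 4 and that rows within a block are already matched pairs.

```r
# Example data from the question
df1 <- data.frame(
  A = c(0.67, 0.45, 0.76, 0.67, 0.56, 0.88, 0.34, 0.56, 0.35, 0.45, 0.67, 0.87),
  B = c(0.45, 0.54, 0.67, 0.86, 0.23, 0.56, 0.34, 0.66, 0.21, 0.55, 0.56, 0.45)
)

# Hypothetical grouping vector: 1,1,1,1,2,2,2,2,3,3,3,3
grp <- (seq_len(nrow(df1)) - 1) %/% 4 + 1

# Paired Wilcoxon signed-rank test within each block of four rows
tests <- lapply(split(df1, grp),
                function(x) wilcox.test(x$A, x$B, paired = TRUE))

# Pull the p-value out of each test result into a named numeric vector
p_values <- vapply(tests, function(res) res$p.value, numeric(1))
```

Each element of tests is a regular htest object, so tests[["2"]] prints the full result for rows 5:8. With only four pairs per block, R may warn that it cannot compute an exact p-value when a block contains tied or zero differences (rows 5:8 here include a zero difference), and the attainable p-values are coarse, so treat the individual results with some caution.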