Comparing columns within the same group


My data frame:

data <- structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), col1 = c(9, 
9.05, 7.15, 7.21, 7.34, 8.12, 7.5, 7.84, 7.8, 7.52, 8.84, 6.98, 
6.1, 6.89, 6.5, 7.5, 7.8, 5.5, 6.61, 7.65, 7.68), col2 = c(11L, 
11L, 10L, 1L, 3L, 7L, 11L, 11L, 11L, 11L, 4L, 1L, 1L, 1L, 2L, 
2L, 1L, 4L, 8L, 8L, 1L), col3 = c(7L, 11L, 3L, 7L, 11L, 2L, 11L, 
5L, 11L, 11L, 5L, 11L, 11L, 2L, 9L, 9L, 3L, 8L, 11L, 11L, 2L), 
    col4 = c(11L, 11L, 11L, 11L, 6L, 11L, 11L, 11L, 10L, 7L, 
    11L, 2L, 11L, 3L, 11L, 11L, 6L, 11L, 1L, 11L, 11L), col5 = c(11L, 
    1L, 2L, 2L, 11L, 11L, 1L, 10L, 2L, 11L, 1L, 3L, 11L, 11L, 
    8L, 8L, 11L, 11L, 11L, 2L, 9L)), class = "data.frame", row.names = c(NA, 
-21L))

I have a function that compares columns within each group. I would like it to compare only the pairs of columns that I select, rather than every column with every other.

Currently the function compares every pair (numbers are column positions in data): (2-3; 2-4; 2-5; 2-6; 3-4; 3-5; 3-6; 4-5; 4-6; 5-6)

I want to choose the pairs myself, for example: (2-4; 3-5; 4-6)

Function:

wilcox.fun <- function(dat) { 
  do.call(rbind, combn(names(dat), 2, function(x) {
    test <- wilcox.test(dat[[x[1]]], dat[[x[2]]], paired=TRUE)
    data.frame(Test = sprintf('Group %s by Group %s', x[1], x[2]), 
               W = round(test$statistic,4), 
               p = test$p.value)
  }, simplify = FALSE))
}


result <- purrr::map_df(split(data[,c(2,3,4,5,6)], data$group), wilcox.fun, .id = 'Group')

Solution

  • You can create a list of vectors holding the combinations you are interested in. In the function, replace combn, which evaluates every pair of columns, with lapply over that list, so that only the pairs you defined are tested. Then combine the per-group results with purrr::map_df. Because the pairs are column positions in the full data frame, split data itself rather than data[,c(2,3,4,5,6)].

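    # Pairs of column positions in `data` to compare
    combination <- list(c(2, 4), c(3, 5), c(4, 6))
    
    wilcox.fun <- function(dat) { 
      # Run one paired test per selected pair and stack the rows
      do.call(rbind, lapply(combination, function(x) {
        test <- wilcox.test(dat[[x[1]]], dat[[x[2]]], paired = TRUE)
        # The label shows the column positions, e.g. 'Group 2 by Group 4'
        data.frame(Test = sprintf('Group %s by Group %s', x[1], x[2]), 
                   W = round(test$statistic, 4), 
                   p = test$p.value)
      }))
    }
    
    # Split the full data frame so positions 2..6 still refer to col1..col5
    result <- purrr::map_df(split(data, data$group), wilcox.fun, .id = 'Group')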
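
A possible variation on the approach above is to give the pairs by column name instead of position, so the result labels stay readable and do not depend on column order. The sketch below is only an illustration under stated assumptions: the toy data frame and the names toy, pairs_by_name, wilcox_pairs and result_named are hypothetical and not part of the original answer, and base R lapply and rbind are used in place of purrr::map_df.

```r
# Toy data with the same layout as the question: one grouping column
# plus five value columns. The values are random and purely illustrative.
set.seed(1)
toy <- data.frame(
  group = rep(1:3, each = 8),
  col1 = rnorm(24), col2 = rnorm(24), col3 = rnorm(24),
  col4 = rnorm(24), col5 = rnorm(24)
)

# The pairs 2-4, 3-5 and 4-6 from the question, written as column names.
pairs_by_name <- list(c("col1", "col3"), c("col2", "col4"), c("col3", "col5"))

# Run a paired Wilcoxon test for each requested pair within one group.
wilcox_pairs <- function(dat, pairs) {
  do.call(rbind, lapply(pairs, function(x) {
    test <- wilcox.test(dat[[x[1]]], dat[[x[2]]], paired = TRUE)
    data.frame(Test = sprintf("%s vs %s", x[1], x[2]),
               W = unname(test$statistic),  # drop the name to keep row names clean
               p = test$p.value)
  }))
}

# Apply the function per group, then stack the results with a Group column.
by_group <- lapply(split(toy, toy$group), wilcox_pairs, pairs = pairs_by_name)
result_named <- do.call(rbind, lapply(names(by_group), function(g) {
  cbind(Group = g, by_group[[g]])
}))
rownames(result_named) <- NULL
```

Passing the pairs as an argument, rather than reading a global combination object, lets the same function be reused with a different selection of columns.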