r, grouping, combinations, combinatorics

Determining All Combinations but With a Grouping Variable


I have the following list of data.

Input <- list(c("1", "2"), c("3", "4"), c("5", "6", "7"))

I want to take one item from each list element and combine them into a vector. Then, from the items remaining in each list element, I'd like to take another item from each and combine those into a second vector, repeating until I have a predetermined number of vectors. Here that number is 2, which is also the maximum possible, since the shortest list element in Input has only 2 items.

There are many possible ways to do this, and I'm hoping to find a way to do it that can return every possibility, like this Output list below. I don't really care about the form of the output as long as it contains the same information.

Output <- lapply(list(
  rbind(as.character(c(1, 3, 5)), as.character(c(2, 4, 6))),
  rbind(as.character(c(1, 3, 5)), as.character(c(2, 4, 7))),
  rbind(as.character(c(1, 3, 6)), as.character(c(2, 4, 5))),
  rbind(as.character(c(1, 3, 6)), as.character(c(2, 4, 7))),
  rbind(as.character(c(1, 3, 7)), as.character(c(2, 4, 5))),
  rbind(as.character(c(1, 3, 7)), as.character(c(2, 4, 6))),
  rbind(as.character(c(1, 4, 5)), as.character(c(2, 3, 6))),
  rbind(as.character(c(1, 4, 5)), as.character(c(2, 3, 7))),
  rbind(as.character(c(1, 4, 6)), as.character(c(2, 3, 5))),
  rbind(as.character(c(1, 4, 6)), as.character(c(2, 3, 7))),
  rbind(as.character(c(1, 4, 7)), as.character(c(2, 3, 5))),
  rbind(as.character(c(1, 4, 7)), as.character(c(2, 3, 6)))
), function (x) {
  lapply(as.data.frame(t(x)), function (y) {
    y
  })
})

This example is pretty tiny. In practice, I'll probably have more groups (list elements in the Input list) and more elements within each of these groups, and the sizes of the groups may be different like they were in my example. Is there an efficient, programmatic way of doing this operation? I'd love to see a solution using base functions but I'm open to anything. The expand.grid() function doesn't work since it doesn't take into account the grouping variable I have.
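
To illustrate the limitation, here is a minimal sketch of what expand.grid() returns for the example data. It lists every single pick of one item per group, but it does not say which picks can be combined without reusing an item. The name `picks` is only for illustration.

```r
Input <- list(c("1", "2"), c("3", "4"), c("5", "6", "7"))

# Every way to pick exactly one item from each group (one pick per row).
picks <- expand.grid(Input, stringsAsFactors = FALSE)
nrow(picks)  # 2 * 2 * 3 = 12 single picks

# Rows 1 and 2 both use "3" and "5", so they cannot appear together
# in the same result.
picks[1:2, ]
```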


Solution

  • You could try

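    # All ways to pick one item from each group (one pick per row)
    lst <- expand.grid(Input)
    # Number of picks per result: the size of the smallest group
    minlen <- min(lengths(Input))
    res <- Filter(
        length,  # drop the NULLs returned for rejected row subsets
        combn(
            seq_len(nrow(lst)),
            minlen,
            function(x) {
                # keep the subset only if no column repeats an item
                if (all(!apply(lst[x, ], 2, anyDuplicated))) {
                    lst[x, ]
                }
            },
            simplify = FALSE
        )
    )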
    

    which gives

    > res
    [[1]]
      Var1 Var2 Var3
    1    1    3    5
    8    2    4    6
    
    [[2]]
       Var1 Var2 Var3
    1     1    3    5
    12    2    4    7
    
    [[3]]
      Var1 Var2 Var3
    2    2    3    5
    7    1    4    6
    
    [[4]]
       Var1 Var2 Var3
    2     2    3    5
    11    1    4    7
    
    [[5]]
      Var1 Var2 Var3
    3    1    4    5
    6    2    3    6
    
    [[6]]
       Var1 Var2 Var3
    3     1    4    5
    10    2    3    7
    
    [[7]]
      Var1 Var2 Var3
    4    2    4    5
    5    1    3    6
    
    [[8]]
      Var1 Var2 Var3
    4    2    4    5
    9    1    3    7
    
    [[9]]
       Var1 Var2 Var3
    5     1    3    6
    12    2    4    7
    
    [[10]]
       Var1 Var2 Var3
    6     2    3    6
    11    1    4    7
    
    [[11]]
       Var1 Var2 Var3
    7     1    4    6
    10    2    3    7
    
    [[12]]
      Var1 Var2 Var3
    8    2    4    6
    9    1    3    7
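
As a rough sanity check on the 12 results above, the expected count can be worked out directly from the group sizes. This is only a sketch, assuming that results differing only in row order count once. The names `sizes`, `k`, `ordered_draws`, `n_results` and `n_candidates` are illustrative and not part of the answer's code.

```r
Input <- list(c("1", "2"), c("3", "4"), c("5", "6", "7"))
sizes <- lengths(Input)
k <- min(sizes)

# Ordered ways to draw k distinct items from each group: n! / (n - k)!
ordered_draws <- factorial(sizes) / factorial(sizes - k)

# The k rows of a result are unordered, so divide by k!
n_results <- prod(ordered_draws) / factorial(k)
n_results     # 12

# Number of row subsets that combn() has to inspect to find them
n_candidates <- choose(prod(sizes), k)
n_candidates  # 66
```

The number of candidates grows much faster than the number of valid results, so the combn() filter is likely to become slow as groups are added or enlarged.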