
Combine all possible rows of data frame in R


I have the following data frame:

x <- data.frame("Col1" = c('A', 'B', 'C', 'D'), "Col2" = c('W', 'X', 'Y', 'Z'))

I want a new data frame containing every possible combination of a Col1 value with a Col2 value, which would give two columns containing something like:

A W
A X
A Y
A Z
B W
B X
B Y
B Z
C W
...

The data frame will always have two columns, but the number of rows can vary.

I looked at permute() and sample(), but I did not manage to get what I am looking for. Thanks!


Solution

  • tidyr::complete() is designed for this. I'm surprised I don't see a vanilla example on SO.

    library(magrittr)
    x %>% 
      tidyr::complete(Col1, Col2)
    

    Result:

    # A tibble: 16 x 2
       Col1  Col2 
       <fct> <fct>
     1 A     W    
     2 A     X    
     3 A     Y    
     4 A     Z    
     5 B     W    
     6 B     X    
     7 B     Y    
     8 B     Z    
     9 C     W    
    10 C     X    
    11 C     Y    
    12 C     Z    
    13 D     W    
    14 D     X    
    15 D     Y    
    16 D     Z    
    

    If your real-world scenario is as simple as the OP's, @bouncyball's suggestion of expand.grid(x) is the cleanest. If it has more complexity, tidyr::complete() may let you grow more easily. I commonly have more columns than just the two ID variables being expanded/completed. The extra columns are typically the analysis's dependent/outcome variables, and the fill parameter lets you specify their default value for combinations that don't appear in the observed dataset. Here's an SO example.

    edited to reflect advice of @bouncyball and @ADuv.
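
For comparison, here is a rough sketch of the two alternatives mentioned above: the base R expand.grid(x) route, and tidyr::complete() with the fill parameter. The second data frame (obs) and its outcome column n are made up purely for illustration and are not part of the question. The tidyr part only runs if tidyr is installed.

```r
# Same toy data as in the question. stringsAsFactors = FALSE keeps the
# columns as character vectors regardless of R version.
x <- data.frame(
  Col1 = c("A", "B", "C", "D"),
  Col2 = c("W", "X", "Y", "Z"),
  stringsAsFactors = FALSE
)

# Base R: expand.grid() accepts a data frame (a list of columns) and
# crosses the columns. The first column varies fastest, so sort
# afterwards to get the A W, A X, A Y, ... order from the question.
grid <- expand.grid(x, stringsAsFactors = FALSE)
grid <- grid[order(grid$Col1, grid$Col2), ]
rownames(grid) <- NULL
print(grid)

# tidyr: a hypothetical dataset where only some ID combinations were
# observed, plus an outcome column `n` (an assumption for this sketch).
if (requireNamespace("tidyr", quietly = TRUE)) {
  obs <- data.frame(
    Col1 = c("A", "A", "B"),
    Col2 = c("W", "X", "W"),
    n    = c(5, 2, 7),
    stringsAsFactors = FALSE
  )

  # complete() adds the missing B/X combination; `fill` sets n to 0
  # there instead of leaving it NA.
  filled <- tidyr::complete(obs, Col1, Col2, fill = list(n = 0))
  print(filled)
}
```

expand.grid() needs no extra packages, which is why it is the simpler choice when the data frame holds nothing but the two ID columns. complete() becomes more useful once other columns have to be carried along and given defaults.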