Search code examples
runiquecombinations

R - Select unique combinations of values from within the same column


Assuming a data frame in the following structure:

ColA  ColB
A     1
A     2
A     4
B     3
B     2
B     4
C     1
C     1
C     1

Is it possible to select all combinations of A, B and C that sum to 7 (using each of A, B & C once), for example as follows:

ColA  Combination1  Combination2
A     2             4
B     4             2
C     1             1

Alternatively, would it be better to loop through all unique combinations of A, B & C to determine their total and then column bind them? If so, how would I loop through the unique combinations of A, B & C assuming data in the structure of the first data frame?


Solution

  • Here is my suggestion:

    df <- read.table( header =TRUE,
                      text = "ColA  ColB
                              A     1
                              A     2
                              A     4
                              B     3
                              B     2
                              B     4
                              C     1
                              C     1
                              C     1")
    
    X <- tapply(df$ColB, df$ColA, unique)
    G <- expand.grid(X)
    G <- G[rowSums(G)==7, ]
    
    Combinations <- data.frame(t(G))
    colnames(Combinations) <- paste0("Comb_", 1:ncol(Combinations))
    

    The list X contains the unique values of A,B, and C:

    X
    #$A
    #[1] 1 2 4
    
    #$B
    #[1] 3 2 4
    
    #$C
    #[1] 1
    

    Then expand.grid builds the cartesian product G, i.e. the combinations of these unique values:

    G
    #  A B C
    #1 1 3 1
    #2 2 3 1
    #3 4 3 1
    #4 1 2 1
    #5 2 2 1
    #6 4 2 1
    #7 1 4 1
    #8 2 4 1
    #9 4 4 1
    

    After G <- G[rowSums(G)==7,] only the combinations with a total sum of 7 are left:

    G
    #  A B C
    #6 4 2 1
    #8 2 4 1
    

    Finally the data in the matrix G is put into the data.frame Combinations:

    Combinations
    #  Comb_1 Comb_2
    #A      4      2
    #B      2      4
    #C      1      1