Tags: r, combinations, permutation, combn

Generate all unique combinations from a vector with repeating elements


This question was asked previously, but only for vectors with non-repeating elements. I was not able to find an easy way to get all unique combinations from a vector with repeating elements. To illustrate, here is an example.

x <- c('red', 'blue', 'green', 'red', 'green', 'red')

Vector x contains 'red' three times and 'green' twice. The expected result, listing every unique combination, would look like this.

# unique combinations with one element
'red'
'blue'
'green'
# unique combinations with two elements
'red', 'blue' # same as 'blue','red'
'red', 'green' 
'red', 'red'
'blue', 'green'
'green', 'green'
# unique combinations with three elements
'red', 'blue', 'green'
'red', 'red', 'blue'
'red', 'red', 'green'
'red', 'red', 'red' # This is valid because there are three 'red's
'green', 'green', 'red'
'green', 'green', 'blue'
# more unique combinations with four, five, and six elements

Solution

  • Using combn() with lapply() should do the trick.

    x <- c('red', 'blue', 'green', 'red', 'green', 'red')
    
    # store the result as cc (it is reused below); the outer parentheses also print it
    (cc <- lapply(1:3, function(y) combn(x, y)))
    
    # [[1]]
    #      [,1]  [,2]   [,3]    [,4]  [,5]    [,6]
    # [1,] "red" "blue" "green" "red" "green" "red"
    
    # [[2]]
    #      [,1]   [,2]    [,3]  [,4]    [,5]  [,6]    ...
    # [1,] "red"  "red"   "red" "red"   "red" "blue"  ...
    # [2,] "blue" "green" "red" "green" "red" "green" ...
    
    # [[3]]
    #      [,1]    [,2]   [,3]    [,4]   [,5]    [,6]    ...
    # [1,] "red"   "red"  "red"   "red"  "red"   "red"   ...
    # [2,] "blue"  "blue" "blue"  "blue" "green" "green" ...
    # [3,] "green" "red"  "green" "red"  "red"   "green" ...
    

    All unique combinations

    lapply(cc, function(y)
      y[,!duplicated(apply(y, 2, paste, collapse="."))]
    )
    
    # [[1]]
    # [1] "red"   "blue"  "green"
    
    # [[2]]
    #      [,1]   [,2]    [,3]  [,4]    [,5]   [,6]    [,7]
    # [1,] "red"  "red"   "red" "blue"  "blue" "green" "green"
    # [2,] "blue" "green" "red" "green" "red"  "red"   "green"
    
    # [[3]]
    #      [,1]    [,2]   [,3]    [,4]    [,5]    [,6]  [,7]    ...
    # [1,] "red"   "red"  "red"   "red"   "red"   "red" "blue"  ...
    # [2,] "blue"  "blue" "green" "green" "red"   "red" "green" ...
    # [3,] "green" "red"  "red"   "green" "green" "red" "red"   ...
    

    Strictly speaking, though, those aren't all unique combinations, since some of them are permutations of each other (for example, "red", "blue" and "blue", "red" both appear in the second element).

    Properly unique combinations

    lapply(cc, function(y)
      y[,!duplicated(apply(y, 2, function(z) paste(sort(z), collapse=".")))]
    )
    
    # [[1]]
    # [1] "red"   "blue"  "green"
    
    # [[2]]
    #      [,1]   [,2]    [,3]  [,4]    [,5]
    # [1,] "red"  "red"   "red" "blue"  "green"
    # [2,] "blue" "green" "red" "green" "green"
    
    # [[3]]
    #      [,1]    [,2]   [,3]    [,4]    [,5]  [,6]
    # [1,] "red"   "red"  "red"   "red"   "red" "blue"
    # [2,] "blue"  "blue" "green" "green" "red" "green"
    # [3,] "green" "red"  "red"   "green" "red" "green"
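
The question also asks for combinations of four, five, and six elements, while the code above stops at three. As a rough sketch, the sorted-key approach could be wrapped in a small helper that covers every size. The name `unique_combinations` and its `sizes` argument are made up for this illustration and are not part of the original answer. The sketch keeps the "." separator and so assumes that no element of `x` contains a "." itself. It also adds `drop = FALSE` so that each result stays a matrix even when only one column survives.

```r
# Hypothetical helper (not from the original answer): unique combinations of
# each requested size, treating reorderings as the same combination.
unique_combinations <- function(x, sizes = seq_along(x)) {
  lapply(sizes, function(k) {
    m <- combn(x, k)  # one combination per column
    # Canonical key per column: sort the elements so their order does not matter.
    # Assumes no element contains "." (otherwise keys could collide).
    keys <- apply(m, 2, function(z) paste(sort(z), collapse = "."))
    # drop = FALSE keeps the result a matrix even with a single column left
    m[, !duplicated(keys), drop = FALSE]
  })
}

x <- c('red', 'blue', 'green', 'red', 'green', 'red')
res <- unique_combinations(x)

# number of unique combinations for sizes 1 to 6
sapply(res, ncol)
# [1] 3 5 6 5 3 1
```

The first three counts (3, 5, 6) agree with the output shown above. Because `combn()` enumerates every index combination before the duplicates are filtered out, this is probably only practical for fairly short vectors.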