
Number of combinations with restrictions in R


Dear R and stats community,

How many ways are there to select 3 objects from a total of 9 so that the 3 selected objects all differ in one external characteristic?

In my example, three selected specimens must always differ in their species attribute. That is, in the selection of 3, there should always be one representative of species A, one of species B and one of species C, and the order within this selection does not matter.

> mydata <- data.frame(specimens = paste("s", 1:9, sep = ""), species = LETTERS[1:3])
> mydata
  specimens species
1        s1       A
2        s2       B
3        s3       C
4        s4       A
5        s5       B
6        s6       C
7        s7       A
8        s8       B
9        s9       C

How many combinations are there? I know how to count the combinations of ANY set of three objects with e.g. arrangements::ncombinations(9,3) or choose(9,3).

I have seen that combn() accepts a custom function, which could help in selecting only the combinations with the required properties.

library(utils)
combn(mydata$specimens, 3, FUN)

I am not able to design such a function myself. One of my unsuccessful attempts is:

library(utils)
outp <- combn(as.character(mydata$specimens), 3, function(x) !duplicated(mydata$species))
> outp[, 1:10]
       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
 [1,]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [2,]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [3,]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [4,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [5,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [6,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [7,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [8,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [9,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Thanks in advance for your help.


Solution

  • There has to be an A, a B and a C. There are 3 possible As. Each of these must appear with one of the Bs, so there are 3 * 3 = 9 possible pairs of As and Bs. Each of these pairs can be grouped with one of the 3 Cs, so there are 3 * 3 * 3 = 27 possible combinations.

    You can see them all using this one-liner:

    expand.grid(split(mydata$specimens, mydata$species))
    #>     A  B  C
    #> 1  s1 s2 s3
    #> 2  s4 s2 s3
    #> 3  s7 s2 s3
    #> 4  s1 s5 s3
    #> 5  s4 s5 s3
    #> 6  s7 s5 s3
    #> 7  s1 s8 s3
    #> 8  s4 s8 s3
    #> 9  s7 s8 s3
    #> 10 s1 s2 s6
    #> 11 s4 s2 s6
    #> 12 s7 s2 s6
    #> 13 s1 s5 s6
    #> 14 s4 s5 s6
    #> 15 s7 s5 s6
    #> 16 s1 s8 s6
    #> 17 s4 s8 s6
    #> 18 s7 s8 s6
    #> 19 s1 s2 s9
    #> 20 s4 s2 s9
    #> 21 s7 s2 s9
    #> 22 s1 s5 s9
    #> 23 s4 s5 s9
    #> 24 s7 s5 s9
    #> 25 s1 s8 s9
    #> 26 s4 s8 s9
    #> 27 s7 s8 s9
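The count can also be checked in code. The following is a minimal sketch, assuming the `mydata` frame from the question; the names `species_of`, `n_direct`, `is_valid` and `valid_triples` are introduced here for illustration only. It computes the product of the group sizes directly, then cross-checks it with the `combn()` approach the question was attempting. Note that the function in the question's attempt never uses its argument `x`, so it returns the same result for every combination; the version below looks up the species of the specimens in `x`.

```r
# Rebuild the example data from the question
mydata <- data.frame(specimens = paste0("s", 1:9), species = LETTERS[1:3])

# Direct count: one specimen per species, so multiply the group sizes
n_direct <- prod(table(mydata$species))

# Lookup from specimen name to species (illustrative helper)
species_of <- setNames(as.character(mydata$species),
                       as.character(mydata$specimens))

# Brute-force check: for each of the choose(9, 3) = 84 triples,
# test whether the three species are all different
specimens <- as.character(mydata$specimens)
is_valid <- combn(specimens, 3,
                  FUN = function(x) anyDuplicated(species_of[x]) == 0)

# Keep only the valid triples, one per column
valid_triples <- combn(specimens, 3)[, is_valid, drop = FALSE]

n_direct              # 27
sum(is_valid)         # 27
ncol(valid_triples)   # 27
```

The brute-force filter scales poorly because it enumerates every triple before discarding most of them, so for larger data the `expand.grid()` approach or the plain product of group sizes is likely the better choice.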