r, algorithm, permutation

Counting existing permutations in R


I have a large dataset with columns IDNum, Var1, Var2, Var3, Var4, Var5, Var6. The six variables are boolean, each taking the value 0 or 1, so each row is one of 2^6 = 64 possible permutations. I would like to count the number of rows corresponding to each permutation that is actually present. Is there an efficient way to do this in R?


Solution

  • aggregate can do this. Here's a smaller example with two variables:

    > r <- function() rbinom(10, 1, .5)
    > d <- data.frame(IDNum=1:10, Var1=r(), Var2=r())
    > d
       IDNum Var1 Var2
    1      1    0    1
    2      2    0    1
    3      3    0    0
    4      4    1    0
    5      5    1    1
    6      6    0    0
    7      7    1    1
    8      8    1    0
    9      9    0    1
    10    10    0    1
    

    Now count the number of rows for each combination:

    > aggregate(d$IDNum, d[-1], FUN=length)
      Var1 Var2 x
    1    0    0 2
    2    1    0 2
    3    0    1 4
    4    1    1 2
    

    The actual values in d$IDNum don't matter here; aggregate just needs some vector to split into groups. For each combination of Var1 and Var2, the matching d$IDNum values are passed to length, which returns how many rows fall in that group. The count appears in the column x.
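
The same call should carry over to the six-variable case in the question. The following is a minimal sketch, not the asker's actual data: the data frame is simulated, and the names `n`, `vars`, and `counts` are chosen here purely for illustration.

```r
# Simulated stand-in for the dataset in the question (assumed, not the real data)
set.seed(42)
n <- 1000
vars <- paste0("Var", 1:6)
d <- data.frame(IDNum = seq_len(n),
                matrix(rbinom(n * 6, 1, 0.5), ncol = 6,
                       dimnames = list(NULL, vars)))

# Count rows for each combination of Var1..Var6 that occurs in the data
counts <- aggregate(d$IDNum, d[vars], FUN = length)

# Rename the default count column "x" to something more descriptive
names(counts)[names(counts) == "x"] <- "n"

# Show the most frequent combinations first
counts <- counts[order(-counts$n), ]
head(counts)
```

Combinations that never occur in the data are simply absent from the result, so `counts` has at most 64 rows and its `n` column sums to the number of rows in `d`.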