r, permutation, variable-length

Generating all permutations when length varies


Background: I am working with a qualitative data coding scheme that has seven ordered levels of codes. Five of the levels contain a single option, and two contain two mutually exclusive options. A given code is a concatenation of up to seven component codes, which must occur in the order of the levels (thus we have permutations rather than combinations). The hard part is that a code may contain any number of levels, from 1 to 7.

Level 1 : A
Level 2 : B or C
Level 3 : D or E
Level 4 : F
Level 5 : G
Level 6 : H
Level 7 : I

Equally valid example codes : ABDFGHI, ACF, I, FGHI, ACE, FH

Issue: I need to create a list of all valid codes, but I am struggling to find a strategy, since the permutations can be of any length and I cannot find a relevant existing question here. My initial intent was to use R, but any way of getting a complete list is welcome. Any pointers?


Solution

  • I am not sure exactly how you need your output, but this works. Assign each level to a variable, adding an NA to each one to stand for "level not used". Then use expand.grid like so:

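    # Options for each level; NA means the level is left out of the code
    L1 <- c("A", NA)
    L2 <- c("B", "C", NA)
    L3 <- c("D", "E", NA)
    L4 <- c("F", NA)
    L5 <- c("G", NA)
    L6 <- c("H", NA)
    L7 <- c("I", NA)
    # One row per combination of choices across the seven levels
    expand.grid(L1 = L1, L2 = L2, L3 = L3, L4 = L4, L5 = L5, L6 = L6, L7 = L7)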
    

    Each row of the output is one combination, with NA in the columns for levels that are not included. There are 2 × 3 × 3 × 2 × 2 × 2 × 2 = 288 rows. Note that the last one, row 288, is all NA, so it does not correspond to a valid code.

    To get a row without the NAs, you could do the following (using row 283 as an example):

    Levels <- expand.grid(L1 = L1, L2 = L2, L3 = L3, L4 = L4, L5 = L5, L6 = L6, L7 = L7)
    # Keep only the non-NA entries of row 283
    Levels[283, ][!is.na(Levels[283, ])]
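If what you want in the end is a plain character vector of codes rather than a data frame, one possible variation is sketched below. It assumes an empty string "" can stand in for NA as the "level not used" marker, so that the columns can be pasted together directly. The names level_options, grid and codes are only illustrative.

```r
# Sketch of a variation on the expand.grid approach (not the only way to do it).
# Assumption: "" is used instead of NA to mean "this level is left out".
level_options <- list(
  L1 = c("A", ""),
  L2 = c("B", "C", ""),
  L3 = c("D", "E", ""),
  L4 = c("F", ""),
  L5 = c("G", ""),
  L6 = c("H", ""),
  L7 = c("I", "")
)

# One row per combination; keep the columns as character, not factor
grid <- expand.grid(level_options, stringsAsFactors = FALSE)

# Paste the seven columns together row by row, in level order
codes <- do.call(paste0, grid)

# Drop the single empty code (the row where every level was left out)
codes <- codes[codes != ""]

length(codes)  # 287
```

Because every letter belongs to exactly one level, no two rows collapse to the same string, so the 288 rows give 287 distinct codes once the empty one is removed. You can check a particular code with something like "FGHI" %in% codes.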