Tags: r, combinations, levels

R: all combinations of all lengths from a vector of elements each with 2 conditions


Following up on this question I posted a few days ago, I would like to extend the same case to combinations of every length.

So I have a vector of the form:

markers <- LETTERS[1:5]

Originally I just wanted all possible combinations of the conditions + and - across all the markers, i.e., the "lowest hierarchy level": combinations of all 5 markers.

So applying the answer to the above question, I obtained the following:

 [1] "A+/B+/C+/D+/E+" "A-/B+/C+/D+/E+" "A+/B-/C+/D+/E+" "A-/B-/C+/D+/E+" "A+/B+/C-/D+/E+" "A-/B+/C-/D+/E+" "A+/B-/C-/D+/E+"
 [8] "A-/B-/C-/D+/E+" "A+/B+/C+/D-/E+" "A-/B+/C+/D-/E+" "A+/B-/C+/D-/E+" "A-/B-/C+/D-/E+" "A+/B+/C-/D-/E+" "A-/B+/C-/D-/E+"
[15] "A+/B-/C-/D-/E+" "A-/B-/C-/D-/E+" "A+/B+/C+/D+/E-" "A-/B+/C+/D+/E-" "A+/B-/C+/D+/E-" "A-/B-/C+/D+/E-" "A+/B+/C-/D+/E-"
[22] "A-/B+/C-/D+/E-" "A+/B-/C-/D+/E-" "A-/B-/C-/D+/E-" "A+/B+/C+/D-/E-" "A-/B+/C+/D-/E-" "A+/B-/C+/D-/E-" "A-/B-/C+/D-/E-"
[29] "A+/B+/C-/D-/E-" "A-/B+/C-/D-/E-" "A+/B-/C-/D-/E-" "A-/B-/C-/D-/E-"
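The linked answer is not reproduced here, so the following is only a minimal sketch of one way to obtain a listing like the one above. It assumes the expand.grid approach mentioned later in the question; the names `grid` and `full` are illustrative and not taken from the original answer.

```r
markers <- LETTERS[1:5]

# One column per marker, each taking "+" or "-"; the first column varies fastest.
grid <- expand.grid(rep(list(c("+", "-")), length(markers)),
                    stringsAsFactors = FALSE)

# Glue each row's signs onto the marker names and join the parts with "/".
full <- apply(grid, 1, function(x) paste0(markers, x, collapse = "/"))

length(full)  # 32
```

Because the first column of expand.grid varies fastest, the resulting order matches the listing: A flips on every element, E only after the first 16.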

Now I want to extend this to "upper hierarchy" levels of combinations of 1, 2, 3, and 4 markers. So I would get something like:

"A+"
"A-"
"B+"
"B-"
"C+"
"C-"
...
"A+/B+"
"A-/B+"
"A+/B-"
"A-/B-"
"B+/C+"
"B+/C-"
"B-/C+"
"B-/C-"
...
"A+/B+/C+"
"A-/B+/C+"
...
"A+/B+/C+/D+/E+"
"A-/B+/C+/D+/E+"
"A+/B-/C+/D+/E+"
"A-/B-/C+/D+/E+"
"A+/B+/C-/D+/E+"
...

What would be the fastest way to build on the accepted answer to the previous question?

It doesn't have to be done in one shot; it would be fine (or even better) to derive the "inner nodes" from the previous results for groups of 5, perhaps by working on the intermediate expand.grid result.

Any idea? Thanks!

EDIT

For my purposes, the best approach would be to keep a placeholder for every marker in the higher-hierarchy combinations.

So, for example, A+/D- would become A+/NA/NA/D-/NA
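The mapping in that example can be sketched as a small helper. `to_placeholder` is a hypothetical name, and the sketch assumes every part of a label is a marker name followed by a single "+" or "-".

```r
markers <- LETTERS[1:5]

# Hypothetical helper: expand a compact label such as "A+/D-" to one slot per marker.
to_placeholder <- function(label, markers) {
  parts <- strsplit(label, "/", fixed = TRUE)[[1]]
  # The marker name is everything except the trailing sign character.
  idx <- match(substr(parts, 1, nchar(parts) - 1), markers)
  stopifnot(!anyNA(idx))  # every part must refer to a known marker
  slots <- rep("NA", length(markers))
  slots[idx] <- parts
  paste(slots, collapse = "/")
}

to_placeholder("A+/D-", markers)  # "A+/NA/NA/D-/NA"
```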

EDIT 2

The first answer, which creates all possible n-size combinations (including NA) from scratch, is really good. However, in my real-world scenario I can retrieve a much smaller, filtered list of the "lowest hierarchy level" combinations of 5 markers that I am most interested in.

In that scenario, it would be really useful to extract the "upper level nodes" (combinations of 1, 2, 3, 4, ..., n markers, with NA) from that filtered list instead of generating all possible n-size combinations from scratch.

Any idea?


Solution

  • If you want to keep the NA placeholders, treat NA as a third possible value for each marker, alongside "+" and "-". You could do something like:

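    markers <- LETTERS[1:5]

    # Each marker takes one of three values: "+", "-", or the placeholder "NA"
    test <- expand.grid(lapply(seq_along(markers), function(x) c("+", "-", "NA")), stringsAsFactors = FALSE)

    # Build one label per row: marker name plus sign, or just "NA" for a placeholder
    # (this includes the row where every marker is NA)
    apply(test, 1, function(x) paste0(ifelse(x == "NA", "NA", markers), ifelse(x == "NA", "", x), collapse = "/"))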
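For the scenario in EDIT 2, where only a filtered list of full five-marker combinations is available, one possible approach is to mask every subset of positions in each combination with NA and keep the unique results. This is a minimal sketch, not part of the original answer; `filtered`, `masks`, and `upper_nodes` are hypothetical names, and the two strings in `filtered` are stand-ins for a real filtered list.

```r
markers <- LETTERS[1:5]

# Hypothetical filtered subset of the full five-marker combinations.
filtered <- c("A+/B+/C+/D+/E+", "A-/B+/C+/D+/E+")

# Every keep/mask pattern over the marker positions (TRUE = keep the marker).
masks <- as.matrix(expand.grid(rep(list(c(TRUE, FALSE)), length(markers))))
# Drop the pattern that masks every marker.
masks <- masks[rowSums(masks) > 0, , drop = FALSE]

# All nodes above one leaf (the leaf itself included), with NA placeholders.
upper_nodes <- function(combo) {
  parts <- strsplit(combo, "/", fixed = TRUE)[[1]]
  apply(masks, 1, function(keep) paste(ifelse(keep, parts, "NA"), collapse = "/"))
}

# Nodes shared by several leaves are kept only once.
nodes <- unique(unlist(lapply(filtered, upper_nodes)))
```

Each leaf contributes 2^5 - 1 = 31 nodes, so the work scales with the size of the filtered list rather than with 3^5. Applied to all 32 leaves, the sketch yields the same strings as the expand.grid call above, minus the all-NA row.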