r math combinations permutation

How to get all possible combinations of a vector in R with all possible arithmetic functions?


I would like to generate all permutations of a set of variables combined with all four arithmetic operators in R.

example:

testvector <- c("cat", "dog")

# expected outcome

c("cat","dog","cat+dog","cat-dog","dog-cat","cat*dog","cat/dog","dog/cat")

I have approximately 10 variables, so this is difficult to do by hand. I found a Python answer that may do the same thing, but I have to do this in R.

(How to perform all possible combinations of arithmetic operations on 3 integers?)

I want to use this as a kind of manual machine-learning approach to find the combination of variables that best separates a dataset.

If an ML package can do that for me, I'd also be happy to try it.

I also want to use the elements of this vector in linear models, but maybe there is a more straightforward way.

One more thing: if possible, I would also like to include brackets to group variables.

# incomplete example

testvector <- c("cat","dog","bird")

# expected outcome looks like

c("(cat-dog)/bird","(bird+cat)/dog","(dog+bird)*cat")

It would also be fine if this worked on numeric rather than character vectors: I would pre-calculate the necessary variables from the permutations beforehand and then use the results for modelling.
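As a rough sketch of that numeric route, assuming the variables are columns of a data frame (the toy `df` and the two expression strings below are made up for illustration), each expression string could be evaluated with `eval(parse(text = ...))`, and columns that contain non-finite values could be dropped afterwards:

```r
# Hypothetical toy data: the column names stand in for the real variables
df <- data.frame(cat = c(1, 2, 3), dog = c(2, 4, 0))

# Expression strings as they would come out of the permutation step
exprs <- c("cat+dog", "cat/dog")

# Evaluate each string with the columns of df as variables;
# the result is a matrix with one column per expression
features <- sapply(exprs, function(e) eval(parse(text = e), envir = df))

# Flag columns in which every value is finite (no Inf, -Inf, NaN, NA)
keep <- apply(features, 2, function(col) all(is.finite(col)))
features_ok <- features[, keep, drop = FALSE]
```

Here `cat/dog` divides by zero in the third row, so only `cat+dog` survives the filter. Evaluating parsed strings is only reasonable when the strings are generated by your own code, as they are here.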

EDIT:

I adapted the code from the first comment to produce the following. It is a good start, but it would be nicer if (a) the length of testvector were taken into account, so that I do not have to adjust the expression by hand, and (b) "nonsensical" elements could be removed. The latter is less important, as I can program the loop to skip elements that do not evaluate to a real number.


testvector <- c("dog","cat","bird")

testvector <- c(paste0(testvector,")"), paste0("(",testvector), testvector)

c(testvector, do.call(paste, c(subset(expand.grid(testvector,
                                                  c("+", "-", "*", "/"),
                                                  testvector,
                                                  c("+", "-", "*", "/"),
                                                  testvector)),
                               sep = "")))

#head of output:

 [1] "dog)"             "cat)"             "bird)"            "(dog"             "(cat"
 [6] "(bird"            "dog"              "cat"              "bird"             "cat)+dog)+dog)"
[11] "bird)+dog)+dog)"  "(dog+dog)+dog)"   "(cat+dog)+dog)"   "(bird+dog)+dog)"  "dog+dog)+dog)"
[16] "cat+dog)+dog)"    "bird+dog)+dog)"   "cat)-dog)+dog)"   "bird)-dog)+dog)"  "(dog-dog)+dog)"
[21] "(cat-dog)+dog)"   "(bird-dog)+dog)"  "dog-dog)+dog)"    "cat-dog)+dog)"    "bird-dog)+dog)"
[26] "cat)*dog)+dog)"   "bird)*dog)+dog)"  "(dog*dog)+dog)"   "(cat*dog)+dog)"   "(bird*dog)+dog)"
[31] "dog*dog)+dog)"    "cat*dog)+dog)"    "bird*dog)+dog)"   "cat)/dog)+dog)"   "bird)/dog)+dog)"



Solution

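  • We could use expand.grid to build every operand/operator/operand combination, keep only the rows where the two operands differ (Var1 != Var3), and then paste each row into a single string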

    c(testvector, do.call(paste, c(subset(expand.grid(testvector, 
      c("+", "-", "*", "/"), testvector), Var1 != Var3), sep = "")))
    

    -output

    [1] "cat"     "dog"     "dog+cat" "dog-cat" "dog*cat" "dog/cat" "cat+dog" "cat-dog" "cat*dog" "cat/dog"
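
The same expand.grid idea might be extended to the bracketed case and to an arbitrary number of variables. The following is a minimal sketch, not part of the original answer: `build_expressions` is a hypothetical helper that takes every ordered choice of `k` distinct variables and every sequence of `k - 1` operators, and nests brackets from the left. It assumes `k` is no larger than the number of variables.

```r
# Hypothetical helper (not from the original answer): build every expression
# that uses k distinct variables, nesting brackets from the left.
build_expressions <- function(vars, k, ops = c("+", "-", "*", "/")) {
  if (k == 1) return(vars)

  # every ordered choice of k variables, minus rows that reuse a variable
  var_grid <- expand.grid(rep(list(vars), k), stringsAsFactors = FALSE)
  distinct <- apply(var_grid, 1, function(r) anyDuplicated(r) == 0)
  var_grid <- var_grid[distinct, , drop = FALSE]

  # every sequence of k - 1 operators
  op_grid <- expand.grid(rep(list(ops), k - 1), stringsAsFactors = FALSE)

  # pair each variable ordering with each operator sequence
  idx <- expand.grid(v = seq_len(nrow(var_grid)), o = seq_len(nrow(op_grid)))

  expr <- var_grid[idx$v, 1]
  for (j in 2:k) {
    if (j > 2) expr <- paste0("(", expr, ")")  # bracket what was built so far
    expr <- paste0(expr, op_grid[idx$o, j - 1], var_grid[idx$v, j])
  }
  expr
}

testvector <- c("cat", "dog", "bird")

# all expressions with one, two, or three variables: 3 + 24 + 96 = 123
all_exprs <- unlist(lapply(1:3, function(k) build_expressions(testvector, k)))
length(all_exprs)
```

This sketch only produces left-nested groupings such as "(cat-dog)/bird", not forms like "cat/(dog+bird)", and it keeps commutative duplicates such as "cat+dog" and "dog+cat". The number of expressions grows quickly with `k`, so with about 10 variables it is probably wise to keep `k` small.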