I would like to generate all permutations of a set of variables combined with the arithmetic operators in R.
Example:
testvector <- c("cat", "dog")
# expected outcome
c("cat","dog","cat+dog","cat-dog","dog-cat","cat*dog","cat/dog","dog/cat")
I have approximately 10 variables, so this is difficult to do by hand. I found an answer for Python that may do the same thing, but I have to do this in R.
(How to perform all possible combinations of arithmetic operations on 3 integers?)
I want to use this as a kind of manual machine learning approach to find the combination of variables that best separates a dataset.
If an ML package can do that for me, I'd also be happy to try it.
I also want to use the elements of this vector in linear models, but maybe there is a more straightforward way.
One more thing: if possible, I would also like to include brackets to group variables.
# incomplete example
testvector <- c("cat","dog","bird")
# expected outcome looks like
c("(cat-dog)/bird","(bird+cat)/dog","(dog+bird)*cat")
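One way the bracketed strings for three variables could be generated is sketched below. It assumes that only the two groupings `(a op b) op c` and `a op (b op c)` are wanted and that each variable appears once per expression; the column names `v1`, `v2`, `v3`, `op1`, `op2` are made up for illustration.

```r
testvector <- c("cat", "dog", "bird")
ops <- c("+", "-", "*", "/")

# Ordered triples of variables crossed with two operator slots.
grid <- expand.grid(v1 = testvector, op1 = ops, v2 = testvector,
                    op2 = ops, v3 = testvector, stringsAsFactors = FALSE)

# Keep only rows in which the three variables are all different.
grid <- grid[grid$v1 != grid$v2 & grid$v1 != grid$v3 & grid$v2 != grid$v3, ]

# The two ways to bracket three operands.
left  <- sprintf("(%s%s%s)%s%s", grid$v1, grid$op1, grid$v2, grid$op2, grid$v3)
right <- sprintf("%s%s(%s%s%s)", grid$v1, grid$op1, grid$v2, grid$op2, grid$v3)
bracketed <- unique(c(left, right))
```

Many of these strings are algebraically equivalent (for example `(cat+dog)+bird` and `cat+(dog+bird)`), so the vector is larger than strictly needed; this sketch does not try to remove such duplicates.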
It would also be fine if this worked on numeric vectors rather than character vectors: I would pre-calculate the derived variables from the permutations beforehand and then use the results for modelling.
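The numeric route could look roughly like the following sketch. The data frame `df` and its values are hypothetical toy data; the idea is to look up each operator as a function with `match.fun()` and store every result as a new column named after the expression.

```r
# Hypothetical toy data; column names mirror the example above.
df <- data.frame(cat = c(1, 2, 3), dog = c(4, 5, 6))
ops <- c("+", "-", "*", "/")

# All ordered pairs of distinct columns crossed with the four operators.
grid <- expand.grid(a = names(df), op = ops, b = names(df),
                    stringsAsFactors = FALSE)
grid <- grid[grid$a != grid$b, ]

# match.fun("+") returns the addition function, so each operator can be
# applied to two columns without building and parsing a string.
features <- df
for (i in seq_len(nrow(grid))) {
  new_name <- paste0(grid$a[i], grid$op[i], grid$b[i])
  features[[new_name]] <- match.fun(grid$op[i])(df[[grid$a[i]]], df[[grid$b[i]]])
}
```

Here `features` keeps the original columns and adds one column per expression, which can then be passed to a model directly.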
EDIT:
I adapted the code from the first comment to produce the following, which is a good start. It would be nicer if a) the length of testvector were taken into account, so I do not have to adjust the call by hand, and b) "nonsensical" elements could be removed. The latter is less important, I think, as I can program the loop to skip elements that do not evaluate to a real number.
testvector <- c("dog", "cat", "bird")
testvector <- c(paste0(testvector, ")"), paste0("(", testvector), testvector)
c(testvector, do.call(paste, c(subset(expand.grid(testvector,
                                                  c("+", "-", "*", "/"),
                                                  testvector,
                                                  c("+", "-", "*", "/"),
                                                  testvector)),
                               sep = "")))
#head of output:
[1] "dog)" "cat)" "bird)" "(dog" "(cat"
[6] "(bird" "dog" "cat" "bird" "cat)+dog)+dog)"
[11] "bird)+dog)+dog)" "(dog+dog)+dog)" "(cat+dog)+dog)" "(bird+dog)+dog)" "dog+dog)+dog)"
[16] "cat+dog)+dog)" "bird+dog)+dog)" "cat)-dog)+dog)" "bird)-dog)+dog)" "(dog-dog)+dog)"
[21] "(cat-dog)+dog)" "(bird-dog)+dog)" "dog-dog)+dog)" "cat-dog)+dog)" "bird-dog)+dog)"
[26] "cat)*dog)+dog)" "bird)*dog)+dog)" "(dog*dog)+dog)" "(cat*dog)+dog)" "(bird*dog)+dog)"
[31] "dog*dog)+dog)" "cat*dog)+dog)" "bird*dog)+dog)" "cat)/dog)+dog)" "bird)/dog)+dog)"
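Both wishes from the edit could be approached along these lines. This is only a sketch: `build_exprs()` and `is_parseable()` are hypothetical helper names, the first builds the `expand.grid()` call from the number of operands `k` instead of writing it out by hand, and the second treats a string as "nonsensical" when R cannot parse it (for example because of an unbalanced bracket).

```r
# Hypothetical helper: every flat expression with k distinct operands from vars.
build_exprs <- function(vars, k, ops = c("+", "-", "*", "/")) {
  stopifnot(k >= 1, k <= length(vars))
  n_slots <- 2 * k - 1
  slots <- vector("list", n_slots)
  slots[seq(1, n_slots, by = 2)] <- list(vars)                 # operand slots
  if (k > 1) slots[seq(2, n_slots - 1, by = 2)] <- list(ops)   # operator slots
  grid <- do.call(expand.grid, c(slots, stringsAsFactors = FALSE))
  # Drop rows in which the same variable appears more than once.
  operands <- grid[, seq(1, n_slots, by = 2), drop = FALSE]
  keep <- apply(operands, 1, function(r) anyDuplicated(r) == 0)
  do.call(paste0, unname(as.list(grid[keep, , drop = FALSE])))
}

# Hypothetical check: keep a string only if R can parse it as an expression.
is_parseable <- function(s) {
  !inherits(try(parse(text = s), silent = TRUE), "try-error")
}

# Expressions with one, two and three operands, without adjusting the call.
exprs <- unlist(lapply(1:3, function(k) build_exprs(c("dog", "cat", "bird"), k)))

# Filtering strings such as those produced by pasting "(" and ")" onto names.
candidates <- c("(dog+cat)", "dog)+cat", "(bird*dog", "cat/bird")
valid <- Filter(is_parseable, candidates)
```

Parsing only checks syntax; expressions that parse but evaluate to `NaN` or `Inf` on the actual data would still need to be skipped when they are evaluated.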
We could use expand.grid to get the combinations and then paste the rows:
c(testvector, do.call(paste, c(subset(expand.grid(testvector,
                                                  c("+", "-", "*", "/"),
                                                  testvector),
                                      Var1 != Var3),
                               sep = "")))
Output:
[1] "cat" "dog" "dog+cat" "dog-cat" "dog*cat" "dog/cat" "cat+dog" "cat-dog" "cat*dog" "cat/dog"
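To use the resulting strings in linear models, one option is to wrap each one in `I()` inside a formula so that `lm()` treats the operators arithmetically rather than as formula syntax. The sketch below assumes a hypothetical data frame `df` with simulated columns `cat`, `dog` and a response `y`, and compares the candidates by R-squared; any other criterion could be substituted.

```r
# Hypothetical toy data: y depends on the ratio cat/dog plus a little noise.
set.seed(1)
df <- data.frame(cat = runif(50, 1, 10), dog = runif(50, 1, 10))
df$y <- 2 * df$cat / df$dog + rnorm(50, sd = 0.1)

testvector <- c("cat", "dog")
exprs <- c(testvector, do.call(paste, c(subset(expand.grid(testvector,
  c("+", "-", "*", "/"), testvector), Var1 != Var3), sep = "")))

# Fit y ~ I(expr) for every candidate expression and record R-squared.
r2 <- sapply(exprs, function(e) {
  fit <- lm(as.formula(paste0("y ~ I(", e, ")")), data = df)
  summary(fit)$r.squared
})

# Name of the expression with the highest R-squared.
best <- names(which.max(r2))
```

With around 10 variables and longer expressions the number of models grows quickly, so some pre-filtering of the candidate strings is likely to be needed.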