I would like to generate as many data frames as there are permutations of my columns, given that one column has to remain unpermuted (keep the same index position in all generated data frames). Here is the main data frame:
data1 <- data.frame("Alpha"=c(1,2), "Beta"=c(2,2), "Gamma"=c(4,8), "Delta"=c(22,3))
data1
Alpha Beta Gamma Delta
1 1 2 4 22
2 2 2 8 3
Assume the 3rd column (Gamma) must keep its position. For a limited number of permutations, it is easy to use the column indices and permute them manually like this:
data2 <- data1[c(1,4,3,2)]
data2
Alpha Delta Gamma Beta
1 1 22 4 2
2 2 3 8 2
and so on until all permutations of 3 out of 4 columns are reached:
data3 <- data1[c(4,1,3,2)]
data4 <- data1[c(4,2,3,1)]
data5 <- data1[c(2,4,3,1)]
data6 <- data1[c(2,1,3,4)]
data7...
This is inefficient and a nightmare with a large dataset. How can I generate all the data frames quickly without typing every permutation manually? I think permn or combn could be useful, but I am unable to get any further.
If you want all permutations where column 3 is still column 3, then you can generate every permutation of the column indices and keep only those with 3 in the third position:
data1 <- data.frame("Alpha"=c(1,2), "Beta"=c(2,2), "Gamma"=c(4,8), "Delta"=c(22,3))
library(combinat)
idx <- permn(ncol(data1))
idx <- idx[sapply(idx, "[", i = 3) == 3]
res <- lapply(idx, function(x) data1[x])
res
#R> [[1]]
#R> Alpha Beta Gamma Delta
#R> 1 1 2 4 22
#R> 2 2 2 8 3
#R>
#R> [[2]]
#R> Delta Alpha Gamma Beta
#R> 1 22 1 4 2
#R> 2 3 2 8 2
#R>
#R> [[3]]
#R> Alpha Delta Gamma Beta
#R> 1 1 22 4 2
#R> 2 2 3 8 2
#R>
#R> [[4]]
#R> Beta Delta Gamma Alpha
#R> 1 2 22 4 1
#R> 2 2 3 8 2
#R>
#R> [[5]]
#R> Delta Beta Gamma Alpha
#R> 1 22 2 4 1
#R> 2 3 2 8 2
#R>
#R> [[6]]
#R> Beta Alpha Gamma Delta
#R> 1 2 1 4 22
#R> 2 2 2 8 3
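The filter above builds all n! permutations of the column indices and then discards most of them, which grows quickly with the number of columns. A minimal sketch of an alternative, assuming a single fixed position, is to permute only the free column indices and write them back into the free slots. The helper perms below is a hypothetical base R function written for this illustration, not part of combinat; passing the free indices to permn should work as well, as long as there is more than one free column.

```r
data1 <- data.frame(Alpha = c(1, 2), Beta = c(2, 2), Gamma = c(4, 8), Delta = c(22, 3))

# Hypothetical helper: all permutations of a vector, in base R only
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

fixed <- 3                              # position that must not move (assumption: one fixed column)
n     <- ncol(data1)
free  <- setdiff(seq_len(n), fixed)     # positions that may be shuffled

# Permute only the free indices and put them back into the free slots
idx <- lapply(perms(free), function(p) {
  out <- seq_len(n)
  out[free] <- p
  out
})

res <- lapply(idx, function(x) data1[x])
length(res)
```

With this approach only (n - 1)! index vectors are ever created, and the order of the results may differ from the order permn produces.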
If you want the objects in the global environment under the names data1, ..., data6 (data1 is overwritten with an identical copy, since the first permutation is the identity), then call
names(res) <- paste0("data", seq_along(res))
list2env(res, .GlobalEnv)
data1
#R> Alpha Beta Gamma Delta
#R> 1 1 2 4 22
#R> 2 2 2 8 3
data2
#R> Delta Alpha Gamma Beta
#R> 1 22 1 4 2
#R> 2 3 2 8 2
ls() # all the objects in your global environment
#R> [1] "data1" "data2" "data3" "data4" "data5" "data6" "idx" "res"
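As a design note, copying the results into the global environment is optional. A small sketch, assuming you are happy to keep everything in one named list: each data frame can be reached by name, and the whole set can be processed in one pass. The index list below holds only three of the six orderings, for brevity, and col_orders is a name introduced for this illustration.

```r
data1 <- data.frame(Alpha = c(1, 2), Beta = c(2, 2), Gamma = c(4, 8), Delta = c(22, 3))

# Three of the six valid orderings, written out by hand for brevity
idx <- list(c(1, 2, 3, 4), c(1, 4, 3, 2), c(4, 1, 3, 2))
res <- lapply(idx, function(x) data1[x])
names(res) <- paste0("data", seq_along(res))

# Access one data frame by name without creating a global object
res$data2

# Process every data frame in one pass, e.g. record each column order
col_orders <- vapply(res, function(d) paste(names(d), collapse = " "), character(1))
col_orders
```

Keeping the list avoids overwriting existing objects such as data1 and scales to many permutations without cluttering the workspace.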