I have a vector like this:
v <- c("v1" = 3, "v2" = 1, "v3" = 2, "v4" = 1, "v5" = 2, "v6" = 4,
"v7" = 1, "v8" = 4, "v9" = 1, "v10" = 4, "v11" = 3, "v12" = 3)
The names are names of variables and the numbers are group assignments. For example, group 2 contains the variables v3 and v5 (v[which(v==2)]).
I want to create all combinations that include exactly one variable from each group. Two variables from the same group should never appear together in a combination.
Order does not matter, e.g. v1,v3,v7,v8 == v3,v1,v7,v8. Only one of these two should be included.
I came up with a loop solution:
combs <- combn(names(v), 4, simplify = FALSE)
lis_combs_cleaned <- list()
for (i in seq_along(combs)) {
  comb_i <- combs[[i]]
  # keep the combination only if no group occurs more than once
  if (anyDuplicated(v[comb_i]) == 0) {
    lis_combs_cleaned[[as.character(i)]] <- comb_i
  }
}
asDF <- do.call("rbind", lis_combs_cleaned)
That seems to work. However, my actual use case produces many more combinations in combs, and it is not really feasible to loop through all of them.
So I was wondering if anybody has an idea for a more efficient solution?
We can split the vector by group, which gives a list with one element per group, and then pass that list to expand.grid to get all combinations:
do.call("expand.grid", split(v, v) |> lapply(names))
# 1 2 3 4
# 1 v2 v3 v1 v6
# 2 v4 v3 v1 v6
# 3 v7 v3 v1 v6
# 4 v9 v3 v1 v6
# ....
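As a quick sanity check, here is a minimal sketch that rebuilds the same result and verifies it. The names groups, res and row_groups are only illustrative, and stringsAsFactors = FALSE is my own addition so that the columns stay character instead of factor. The idea is that the number of rows should equal the product of the group sizes, and every row should contain each group exactly once.

```r
# Same example vector as in the question
v <- c("v1" = 3, "v2" = 1, "v3" = 2, "v4" = 1, "v5" = 2, "v6" = 4,
       "v7" = 1, "v8" = 4, "v9" = 1, "v10" = 4, "v11" = 3, "v12" = 3)

# One character vector of variable names per group
groups <- lapply(split(v, v), names)

# expand.grid also accepts a list; keep the names as plain character columns
res <- expand.grid(groups, stringsAsFactors = FALSE)

# One row per combination: the row count is the product of the group sizes
stopifnot(nrow(res) == prod(lengths(groups)))

# Look up the group of every variable in each row (one column per row of res)
row_groups <- apply(res, 1, function(r) sort(unname(v[r])))

# Every combination must contain each group exactly once
stopifnot(all(row_groups == sort(unique(v))))
```

Because each column only ever holds variables from a single group, the same set of variables cannot show up in a different order, so no deduplication step should be needed. Nothing is generated and then thrown away either, which is what made the combn loop expensive.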