I'm trying to identify the elements common to multiple vectors, across all possible combinations of those vectors. I had previously tried this one here, but it doesn't quite work because it only retrieves the common elements between 2 groups.
Take this example: I have 10 vectors (with varying numbers of elements) that may share elements with one or more of the other vectors. It is also possible that some elements are exclusive to a single group. As an example, here is the data:
#Creating a mock example: 10 groups, with varying number of elements:
set.seed(753)
for (i in 1:10){
  assign(paste0("grp_",i), paste0("target_", sample(1:40, sample(20:34))))
}
Simply put, I want something analogous to a Venn diagram, but as a data frame/matrix of counts instead. Something like this (note that this is just a snapshot of random parts of what the resulting data frame/matrix should look like):
                                                    grp1  grp2  grp3  grp4  grp1.grp4.grp5.grp8.grp10
grp1                                                -     16    12    20    5
grp2                                                16    -     10    20    4
grp3                                                12    10    -     16    3
grp4                                                20    20    16    -     5
grp1.grp4.grp5.grp8.grp10                           5     4     3     5     10
grp1.grp2.grp3.grp4.grp5.grp6.grp7.grp8.grp9.grp10  0     0     0     0     0

                                                    grp1.grp2.grp3.grp4.grp5.grp6.grp7.grp8.grp9.grp10
grp1                                                3
grp2                                                6
grp3                                                4
grp4                                                1
grp1.grp4.grp5.grp8.grp10                           5
grp1.grp2.grp3.grp4.grp5.grp6.grp7.grp8.grp9.grp10  2
In the table above, please also note that a count whose row and column names are the same is the number of elements exclusive to that particular group (e.g. the count at row 1/column 1 means that there are 88 exclusive elements).
Any help is very much appreciated!
EDIT: the real counts for the expected final matrix have now been added.
OK, if I understood everything correctly, let's give it a try. Note that I put your sample data in a list, so we can index the groups when intersecting them.
set.seed(753)
grps <- list()
for (i in 1:10){
  grps[i] <- list(paste0("target_", sample(1:40, sample(20:34))))
}
You want all 10 groups, which results in 1023 x 1023 combinations (2^10 - 1 = 1023 non-empty subsets of the groups). Making N flexible makes testing a bit easier ;) The key here is that I keep the combinations as a list of integer vectors that we can use to index grps.
N <- 10
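# lapply (rather than sapply) always returns a list, so unlist() flattens it reliably for any N
combinations <- unlist(lapply(seq_len(N), function(n) combn(seq_len(N), n, simplify = FALSE)), recursive = FALSE)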
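To see what this step produces, here is a minimal sketch with only 3 groups. The names N_small and combs_small are made up for this illustration; the number of combinations should be 2^N - 1, one for every non-empty subset of the groups.

```r
# Illustration only: 3 groups instead of 10, so the result is easy to read
N_small <- 3
combs_small <- unlist(
  lapply(seq_len(N_small), function(n) combn(seq_len(N_small), n, simplify = FALSE)),
  recursive = FALSE
)

length(combs_small)
# 7, i.e. 2^3 - 1

sapply(combs_small, paste, collapse = ".")
# "1" "2" "3" "1.2" "1.3" "2.3" "1.2.3"
```

Each element is an integer vector, so something like grps[combs_small[[4]]] would pick out groups 1 and 2.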
Now we have to loop over the combinations twice, since you compare each of the 1023 combinations with every other one and count the elements their intersections have in common. Using nested sapply calls gives us the 1023 x 1023 matrix you want.
results <- sapply(seq_along(combinations), function(i) {
  sapply(seq_along(combinations), function(j) {
    length(intersect(
      Reduce(intersect, grps[combinations[[i]]]),
      Reduce(intersect, grps[combinations[[j]]])
    ))
  })
})
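The double loop recomputes Reduce(intersect, ...) for every pair of combinations. A possible variation, sketched here on made-up toy data (toy, toy_combs, common and toy_results are hypothetical names, not part of the original code), is to intersect each combination once up front and then only compare the stored results. It should give the same counts, but I have not timed it on the full 1023 x 1023 case.

```r
# Toy data (hypothetical, for illustration): three small groups
toy <- list(c("a", "b", "c"), c("b", "c", "d"), c("c", "e"))
toy_combs <- unlist(
  lapply(seq_along(toy), function(n) combn(seq_along(toy), n, simplify = FALSE)),
  recursive = FALSE
)

# Intersect the groups of each combination once, up front
common <- lapply(toy_combs, function(idx) Reduce(intersect, toy[idx]))

# Pairwise counts between the precomputed intersections
toy_results <- sapply(common, function(a) {
  sapply(common, function(b) length(intersect(a, b)))
})

dim(toy_results)
# 7 7
```

The same pattern should carry over to the real data by replacing toy with grps and toy_combs with combinations.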
Now we create the names as shown in your example. They are based on the combinations we created and used earlier.
names <- sapply(combinations, function(x) paste("grp", x, sep = "", collapse = "."))
Set the colnames and rownames of the matrix:
colnames(results) <- rownames(results) <- names
It seems that in your output you want no values on the diagonal, so we set it to NA.
diag(results) <- NA
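If the diagonal should instead hold the exclusive counts mentioned in the question, one possible reading is "elements found in every group of a combination and in no group outside it". That interpretation is an assumption on my part, and the names toy, toy_combs and exclusive below are made up for a small illustration:

```r
# Toy data (hypothetical): "a" occurs only in group 1, "d" only in group 2
toy <- list(c("a", "b", "c"), c("b", "c", "d"), c("c", "b"))
toy_combs <- unlist(
  lapply(seq_along(toy), function(n) combn(seq_along(toy), n, simplify = FALSE)),
  recursive = FALSE
)

# For each combination: elements shared by all of its groups,
# minus anything that also appears in a group outside the combination
exclusive <- sapply(toy_combs, function(idx) {
  shared <- Reduce(intersect, toy[idx])
  length(setdiff(shared, unlist(toy[-idx])))
})
names(exclusive) <- sapply(toy_combs, function(x) paste("grp", x, sep = "", collapse = "."))

exclusive
# grp1 = 1, grp2 = 1, grp1.grp2.grp3 = 2, all other combinations 0
```

Under this reading every element is counted exactly once, so the counts sum to the number of distinct elements. On the real data, the same vector computed from grps and combinations could be assigned with diag(results) <- exclusive in place of NA.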