Tags: r, list, combinations, intersect, combn

Intersect all possible combinations of list elements


I have a list of vectors:

> l <- list(A=c("one", "two", "three", "four"), B=c("one", "two"), C=c("two", "four", "five", "six"), D=c("six", "seven"))

> l
$A
[1] "one"   "two"   "three" "four"

$B
[1] "one" "two"

$C
[1] "two"  "four" "five" "six"

$D
[1] "six"   "seven"

I would like to calculate the length of the overlap between all possible pairwise combinations of the list elements, i.e. (the format of the result doesn't matter):

AintB 2
AintC 2
AintD 0
BintC 1
BintD 0
CintD 1

I know combn(x, 2) can be used to get a matrix of all possible pairwise combinations in a vector and that length(intersect(a, b)) would give me the length of the overlap of two vectors, but I can't think of a way to put the two things together.

Any help is much appreciated! Thanks.


Solution

  • combn works on lists as well as vectors. With simplify = FALSE it returns each pair as a two-element list, so you only need to pull out the two vectors with [[ before passing them to intersect:

    # Get the combinations of names of list elements
    nms <- combn( names(l) , 2 , FUN = paste0 , collapse = "" , simplify = FALSE )
    
    # Make the combinations of list elements
    ll <- combn( l , 2 , simplify = FALSE )
    
    # Intersect the list elements
    out <- lapply( ll , function(x) length( intersect( x[[1]] , x[[2]] ) ) )
    
    # Output with names
    setNames( out , nms )
    #$AB
    #[1] 2
    
    #$AC
    #[1] 2
    
    #$AD
    #[1] 0
    
    #$BC
    #[1] 1
    
    #$BD
    #[1] 0
    
    #$CD
    #[1] 1
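
If a named vector with labels such as AintB (the format shown in the question) is preferred over a list, one possible variant is sketched below. It is not part of the original answer. It assumes the list l from the question, and the names pairs and overlap are only illustrative.

```r
# Same example list as in the question
l <- list(A = c("one", "two", "three", "four"),
          B = c("one", "two"),
          C = c("two", "four", "five", "six"),
          D = c("six", "seven"))

# All pairs of element names, one pair per column (2 x 6 character matrix)
pairs <- combn(names(l), 2)

# Size of the intersection for each pair of list elements
overlap <- apply(pairs, 2, function(p) length(intersect(l[[p[1]]], l[[p[2]]])))

# Label each result with both names joined by "int", e.g. "AintB"
names(overlap) <- apply(pairs, 2, paste, collapse = "int")

overlap
# AintB AintC AintD BintC BintD CintD 
#     2     2     0     1     0     1 
```

Working from the names rather than from the list itself keeps the labels and the counts in the same column order, so they cannot get out of step.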