Tally of pair combinations for multiple variables


I've got a dataset with 4 binary variables, one per column. How do I create a 4 x 4 grid with the tally of each pair combination of the variables?

Here's an example data frame:

Person <- c("Bob", "Jim", "Sarah", "Dave")
A <- c(1,0,1,1)
B <- c(1,1,1,0)
C <- c(0,0,0,1)
D <- c(1,0,0,0)
df <- data.frame(Person, A, B, C, D)

So in the 4x4 grid, the intersection of A and B would have a 2 because Bob and Sarah have 1 for A and B.


Solution

  • For two vectors A and B, the tally is their dot product (what R calls a "cross product"), which you can compute with either of:

    res <- A %*% B
    res <- crossprod(A, B)

    To build a matrix of all combinations, use a two-level for loop or apply:

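    data <- list(A = A, B = B, C = C, D = D)
    n <- length(data)
    # n x n result, labelled by variable name
    res <- matrix(NA, nrow = n, ncol = n, dimnames = list(names(data), names(data)))
    
    # fill the lower triangle, diagonal included
    for(i in 1:n) {
      for(j in 1:i) {
        res[i,j] <- crossprod(data[[i]], data[[j]])
      }
    }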
    

    This fills only the lower half of the matrix (including the diagonal). Since the tally is symmetric, you can then copy the values across to the upper half like this:

    res[upper.tri(res)] <- t(res)[upper.tri(res)]
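
As an alternative to the loop, here is a minimal sketch that gets the whole grid in one call. It assumes the four variables are bound into a numeric matrix; the names `m` and `tally` are made up for illustration. It relies on the fact that `crossprod(m)` computes `t(m) %*% m`, so entry [i, j] is the dot product of columns i and j.

```r
# Example data from the question
A <- c(1, 0, 1, 1)
B <- c(1, 1, 1, 0)
C <- c(0, 0, 0, 1)
D <- c(1, 0, 0, 0)

# Bind the vectors as columns; cbind() keeps the names A..D as column names
m <- cbind(A, B, C, D)

# t(m) %*% m: a 4 x 4 matrix of pairwise co-occurrence counts
tally <- crossprod(m)
print(tally)
```

The diagonal holds the number of 1s in each variable (for example, `tally["A", "A"]` is 3), and `tally["A", "B"]` is 2, matching the Bob and Sarah example in the question. The result is already symmetric, so no copying between halves is needed.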