Search code examples
rmatrixcross-product

R - Given a matrix and a power, produce multiple matrices containing all unique combinations of matrix columns


Based on my related question linked below (see @Aleh solution): I am looking to calculate only unique products between columns in a matrix for a given power.

E.g., for N=5,M=3, p=2, we get the product of columns (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3). I want to modify (@Aleh's) code to only calculate products between columns (1,1), (1,2), (1,3), (2,2), (2,3), (3,3). But I would want to do this for each p-th order.

Can someone help me accomplish this in R?

Many thanks in advance!

Related questions question: R - Given a matrix and a power, produce multiple matrices containing all combinations of matrix columns


Solution

  • If I understand you correctly, then this is what you are looking for:

    # all combinations of p elements out of M with repetiton 
    # c.f. http://www.mathsisfun.com/combinatorics/combinations-permutations.html
    comb_rep <- function(p, M) {
      combn(M + p - 1, p) - 0:(p - 1)
    }
    
    # use cols from mat to form a new matrix
    # take row products
    col_prod <- function(cols, mat) {
      apply(mat[ ,cols], 1, prod)
    }
    
    N <- 5
    M <- 3
    p <- 3
    mat <- matrix(1:(N*M),N,M)
    
    col_comb <- lapply(2:p, comb_rep, M)
    col_comb
    #> [[1]]
    #>      [,1] [,2] [,3] [,4] [,5] [,6]
    #> [1,]    1    1    1    2    2    3
    #> [2,]    1    2    3    2    3    3
    #> 
    #> [[2]]
    #>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    #> [1,]    1    1    1    1    1    1    2    2    2     3
    #> [2,]    1    1    1    2    2    3    2    2    3     3
    #> [3,]    1    2    3    2    3    3    2    3    3     3
    
    # prepend original matrix
    res_mat <- list()
    res_mat[[1]] <- mat
    c(res_mat, 
      lapply(col_comb, function(cols) apply(cols, 2, col_prod, mat)))
    #> [[1]]
    #>      [,1] [,2] [,3]
    #> [1,]    1    6   11
    #> [2,]    2    7   12
    #> [3,]    3    8   13
    #> [4,]    4    9   14
    #> [5,]    5   10   15
    #> 
    #> [[2]]
    #>      [,1] [,2] [,3] [,4] [,5] [,6]
    #> [1,]    1    6   11   36   66  121
    #> [2,]    4   14   24   49   84  144
    #> [3,]    9   24   39   64  104  169
    #> [4,]   16   36   56   81  126  196
    #> [5,]   25   50   75  100  150  225
    #> 
    #> [[3]]
    #>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    #> [1,]    1    6   11   36   66  121  216  396  726  1331
    #> [2,]    8   28   48   98  168  288  343  588 1008  1728
    #> [3,]   27   72  117  192  312  507  512  832 1352  2197
    #> [4,]   64  144  224  324  504  784  729 1134 1764  2744
    #> [5,]  125  250  375  500  750 1125 1000 1500 2250  3375
    

    It is not really efficient, though, since e.g. the third power is calculated from three columns of the original matrix instead of one column of the original matrix and one column of the second power.

    Edit: testing with realistic sizes mentioned in the comments shows that @Moody_Mudskipper's approach for the multiplication is much faster, while my approach for the combinations is a bit faster. So it makes sense to combine the two:

    # original function from @Moody_Mudskipper's answer
    fun <- function(mat,p) {
      mat <- as.data.frame(mat)
      combs <- do.call(expand.grid,rep(list(seq(ncol(mat))),p)) # all combinations including permutations of same values
      combs <- combs[!apply(combs,1,is.unsorted),]              # "unique" permutations only
      rownames(combs) <- apply(combs,1,paste,collapse="-")      # Just for display of output, we keep info of combinations in rownames
      combs <- combs[order(rownames(combs)),]                   # sort to have desired column order on output
      apply(combs,1,function(x) Reduce(`*`,mat[,x]))            # multiply the relevant columns
    }
    combined <- function(mat, p) {
      mat <- as.data.frame(mat)
      combs <- combn(ncol(mat) + p - 1, p) - 0:(p - 1)          # all combinations with repetition
      colnames(combs) <- apply(combs, 2, paste, collapse = "-") # Just for display of output, we keep info of combinations in colnames
      apply(combs, 2, function(x) Reduce(`*`, mat[ ,x]))        # multiply the relevant columns
    }
    N <- 10000
    M <- 25
    p <- 4
    mat <- matrix(runif(N*M),N,M)
    microbenchmark::microbenchmark(
      fun(mat, p),
      combined(mat, p),
      times = 10
    )
    #> Unit: seconds
    #>              expr      min       lq     mean   median       uq      max neval
    #>       fun(mat, p) 3.456853 3.698680 4.067995 4.032647 4.341944 4.869527    10
    #>  combined(mat, p) 2.543994 2.738313 2.870446 2.793768 3.090498 3.254232    10
    

    Note that the two functions do not yield the same results for M > 9 since the column ordering is different due to the lexical sorting with 1-10 < 1-2 employed in fun. The results would be identical if one inserted the same lexical sorting in combined.