Apply function on rows across a column in a list


I need to take the same column from every matrix in a list, cbind those columns, and apply a function to each row of the combined matrix. I need to do this for every column in turn. Inspired by the answer here, I came up with a possible solution for one column:

x <- apply(Reduce("cbind", lapply(L, FUN = function(x) x[, 1])), 1, FUN = sd)

It is clunky and gets worse when expanded to include all columns. Consider a list of matrices:

set.seed(2385737)
L = list(matrix(rnorm(30), ncol = 3), matrix(rnorm(30), ncol = 3), matrix(rnorm(30), ncol = 3))

X <- matrix(c(apply(Reduce("cbind", lapply(L, FUN = function(x) x[, 1])), 1, FUN = sd),
    apply(Reduce("cbind", lapply(L, FUN = function(x) x[, 2])), 1, FUN = sd),
    apply(Reduce("cbind", lapply(L, FUN = function(x) x[, 3])), 1, FUN = sd)),
    ncol = 3
)

I can generalise the code above into:

X <- sapply(seq_len(ncol(L[[1]])),
    FUN = function(i) apply(Reduce("cbind",
        lapply(L, FUN = function(x) x[, i])), 1, FUN = sd))

Is there a cleaner way to run this calculation for every column across the list?


Solution

  • One option is to stack the list of matrices into a single 3D array and run the calculation directly on that array with apply. Although rarely used this way, the MARGIN argument of apply accepts a vector of dimension indices, so the function can be applied along any dimension of the array. With MARGIN = c(1, 2), FUN is called once per (row, column) cell on the vector that runs along the third dimension, that is, on the values at that position across all matrices in the list.

    The whole thing becomes a one-liner if you use the function abind from the abind package to create the array from your list.

    apply(do.call(abind::abind, c(L, along = 3)), c(1, 2), FUN = sd)
    #>            [,1]      [,2]      [,3]
    #>  [1,] 0.5040136 0.1593154 0.9371359
    #>  [2,] 1.2781308 0.5380104 1.1967232
    #>  [3,] 1.3355753 0.5445188 0.8851976
    #>  [4,] 1.5333570 0.9800276 0.5928828
    #>  [5,] 1.4844418 2.1511425 1.6904784
    #>  [6,] 1.5158726 2.0156800 1.3566559
    #>  [7,] 0.8452233 0.3058013 1.0896865
    #>  [8,] 0.5742021 0.8816770 1.4622064
    #>  [9,] 1.7673249 0.9863849 1.1386831
    #> [10,] 0.9001773 1.0793596 0.5754467
    

    This is the same result as X in your example above.
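    To make the role of MARGIN = c(1, 2) concrete, here is a minimal sketch on a toy array. The array A and the use of sum instead of sd are illustrative choices, not part of your data; they just make the result easy to check by eye.

    ```r
    # Toy 2 x 2 x 3 array: slice 1 holds 1..4, slice 2 holds 10..40, slice 3 holds 100..400
    A <- array(c(1:4, (1:4) * 10, (1:4) * 100), dim = c(2, 2, 3))

    # The vector that FUN receives for cell (1, 2) is the "fibre" along dimension 3
    A[1, 2, ]
    # → 3 30 300

    # MARGIN = c(1, 2) keeps rows and columns and collapses the third dimension
    S <- apply(A, c(1, 2), sum)
    S[1, 2]
    # → 333
    ```

    The result S has the same number of rows and columns as each slice, which is why the sd version returns a 10 x 3 matrix for your list.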

    If you prefer to use base R without extra packages, you can create the array directly, provided all matrices in the list have the same dimensions:

    apply(array(unlist(L), c(nrow(L[[1]]), ncol(L[[1]]), length(L))), c(1, 2), sd)
    

    This gives the same result.
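
    If you want to convince yourself that the array approach matches the column-by-column version, a check along the following lines should work. This is a sketch under assumptions: the seed and the names X_loop, X_arr and A are made up for the illustration, and the list is regenerated here rather than taken from your session.

    ```r
    set.seed(1)  # arbitrary seed, used only for this illustration
    L <- replicate(3, matrix(rnorm(30), ncol = 3), simplify = FALSE)

    # Column-by-column version, equivalent in spirit to the code in the question:
    # for column i, bind that column from every matrix and take row-wise sd
    X_loop <- sapply(seq_len(ncol(L[[1]])), function(i) {
      apply(sapply(L, function(m) m[, i]), 1, sd)
    })

    # unlist() concatenates the matrices in column-major order and array()
    # fills in the same order, so slice k of A is exactly L[[k]]
    A <- array(unlist(L), dim = c(dim(L[[1]]), length(L)))
    X_arr <- apply(A, c(1, 2), sd)

    # Both approaches should agree
    all.equal(X_loop, X_arr)

    # Base R's simplify2array() should build the same 3D array from the list
    all.equal(A, simplify2array(L))
    ```

    Because the array is filled positionally, this shortcut is only safe when every matrix in the list has identical dimensions.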