Search code examples
rarraysdivision

Multidimensional STAT in sweep function of R


How can I sweep the first two dimensions of an array based on the first columns of third dimension?

A simplified example:

My array:

a <- array(1:24,c(4,3,2))
> a
, , 1

     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12

, , 2

     [,1] [,2] [,3]
[1,]   13   17   21
[2,]   14   18   22
[3,]   15   19   23
[4,]   16   20   24

I would like to divide each matrix by its first column by sweep (or another) function. I would like to put the following output in a single array by using a single sweep function.

> sweep(a[,,1], 1, a[,,1][,1], "/")
     [,1]     [,2]     [,3]
[1,]    1 5.000000 9.000000
[2,]    1 3.000000 5.000000
[3,]    1 2.333333 3.666667
[4,]    1 2.000000 3.000000

> sweep(a[,,2], 1, a[,,2][,1], "/")
     [,1]     [,2]     [,3]
[1,]    1 1.307692 1.615385
[2,]    1 1.285714 1.571429
[3,]    1 1.266667 1.533333
[4,]    1 1.250000 1.500000

How can I achieve this?


Solution

  • Using a simple for loop.

    res <- array(dim=dim(a))
    for (i in seq_len(dim(a)[3])) res[, , i] <- a[, , i]/a[, 1, i]
    
    res
    # , , 1
    # 
    #      [,1]     [,2]     [,3]
    # [1,]    1 5.000000 9.000000
    # [2,]    1 3.000000 5.000000
    # [3,]    1 2.333333 3.666667
    # [4,]    1 2.000000 3.000000
    # 
    # , , 2
    # 
    #      [,1]     [,2]     [,3]
    # [1,]    1 1.307692 1.615385
    # [2,]    1 1.285714 1.571429
    # [3,]    1 1.266667 1.533333
    # [4,]    1 1.250000 1.500000
    

    This appears to outperform *apply functions.

    a <- array(rnorm(1e3), c(4, 3, 20000))
    microbenchmark::microbenchmark(sapply=array(sapply(seq_len(dim(a)[3]), \(x) a[, , x] / a[, 1, x]), dim=dim(a)),
                                   vapply=array(vapply(seq_len(dim(a)[3]), \(x) a[, , x] / a[, 1, x], vector('numeric', prod(dim(a)[1:2]))), dim=dim(a)),`for`={res <- array(dim=dim(a))
                                   for (i in seq_len(dim(a)[3])) res[, , i] <- a[, , i] / a[, 1, i]},
                                   apply=apply(a, 3, function(x) x / x[,1]))
    # Unit: milliseconds
    #   expr       min        lq      mean    median        uq      max neval cld
    # sapply  73.22494  78.46897  88.27141  86.18902  90.08711 232.8461   100  b 
    # vapply  72.03503  78.24985  86.89068  84.87590  90.58196 226.0209   100  b 
    #    for  58.87842  64.48287  74.33136  70.23162  77.25822 209.9818   100 a  
    #  apply 110.48710 118.79229 130.51282 124.47029 134.64933 294.0801   100   c