arrays, r, matrix, aggregate

Aggregate an array based on the dimension names


I am trying to aggregate an array by its dimension names in an efficient way.

ex_array <- array(1:10000, dim = c(100, 10, 10),
                  dimnames = list(Col1 = c(rep(10,50), rep(20, 50)),
                                  Col2 = 1:10,
                                  Col3 = 1:10))

Now I want to aggregate this array by the names of the first dimension. That dimension has two distinct names (10 and 20), so the new array should have dimensions 2 by 10 by 10. All values whose first-dimension name is 10 should be summed, and likewise all values whose first-dimension name is 20.

Is there some clever way of doing this?


Solution

  • reshape2. I think reshape2 is the best fit here, if you're willing to use packages:

    library(reshape2)
    res = acast(melt(ex_array), Col1 ~ Col2 ~ Col3, fun.aggregate = sum)
    
    str(res)
    #  int [1:2, 1:10, 1:10] 1275 3775 6275 8775 11275 13775 16275 18775 21275 23775 ...
    #  - attr(*, "dimnames")=List of 3
    #   ..$ : chr [1:2] "10" "20"
    #   ..$ : chr [1:10] "1" "2" "3" "4" ...
    #   ..$ : chr [1:10] "1" "2" "3" "4" ...
    

    I think this would also collapse duplicates in the other dimensions' names (if there were any).


  • base R. You can use rowsum, but it's clunky here because it's designed for matrices, so it has to be applied to each slice along the third dimension:

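    # Empty result array; its dimnames are the unique labels of each dimension
    res2 = array(, c(2, 10, 10), dimnames = lapply(dimnames(ex_array), unique))
    # Sum each slice along dim 3 by row name (rowsum sorts its groups, which matches unique() here)
    res2[] = sapply(seq_len(dim(ex_array)[3]), function(k)
      rowsum(ex_array[,,k], rownames(ex_array[,,k])))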
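
A possible variation on the base R idea, offered as a sketch rather than as part of the original answer: since rowsum works on matrices, the array can be flattened into a 100 x 100 matrix (dimensions 2 and 3 become the columns), summed by group in a single call, and reshaped back. The names m, s, and res3 are only illustrative.

```r
ex_array <- array(1:10000, dim = c(100, 10, 10),
                  dimnames = list(Col1 = c(rep(10, 50), rep(20, 50)),
                                  Col2 = 1:10,
                                  Col3 = 1:10))

# Flatten dims 2 and 3 into columns: one row per entry of the first dimension
m <- matrix(ex_array, nrow = dim(ex_array)[1])

# Sum rows that share a first-dimension name (groups come back sorted)
s <- rowsum(m, group = dimnames(ex_array)[[1]])

# Restore the trailing dimensions and attach the collapsed names
res3 <- array(s,
              dim = c(nrow(s), dim(ex_array)[-1]),
              dimnames = c(list(Col1 = rownames(s)), dimnames(ex_array)[-1]))
```

This avoids the loop over slices, and taking the first-dimension labels from rownames(s) keeps them aligned with the sums even when the sorted group order differs from the order of first appearance.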