Search code examples
rarraysdiagonal

Get all diagonal vectors from 3D array in R


My R question is very similar to Get all diagonal vectors from matrix, which has a particularly elegant answer (https://stackoverflow.com/a/27935808/2449926):

A <- matrix(1:16, 4)
d <- row(A) - col(A)
split(A, d)

The trick used here is that, in a 2D matrix, the differences between row indices and column indices are equal along diagonals. This property is leveraged to identify elements that belong to the same diagonal.

Is there any similarly elegant trick to identifying which 3D array elements lie on the same 3D diagonal? Take, for example, the following array:

arr <- array(1:60, dim = c(5, 4, 3))
arr
#> , , 1
#> 
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    6   11   16
#> [2,]    2    7   12   17
#> [3,]    3    8   13   18
#> [4,]    4    9   14   19
#> [5,]    5   10   15   20
#> 
#> , , 2
#> 
#>      [,1] [,2] [,3] [,4]
#> [1,]   21   26   31   36
#> [2,]   22   27   32   37
#> [3,]   23   28   33   38
#> [4,]   24   29   34   39
#> [5,]   25   30   35   40
#> 
#> , , 3
#> 
#>      [,1] [,2] [,3] [,4]
#> [1,]   41   46   51   56
#> [2,]   42   47   52   57
#> [3,]   43   48   53   58
#> [4,]   44   49   54   59
#> [5,]   45   50   55   60

Created on 2024-09-07 with reprex v2.1.1

The input should be the array only and the output solution will be ALL the diagonals in the 3D array, probably as a list, without me needing to specify their starting coordinates. For example, among the exactly 36 valid diagonals, a few precise example results would be:

  • Diagonal starting from position [1, 1, 1]: c(1, 27, 53)
  • Diagonal starting from position [3, 2, 1]: c(8, 34, 60)
  • Diagonal starting from position [1, 3, 1]: c(11, 37)
  • Diagonal starting from position [1, 2, 2]: c(26, 52)

I could certainly iterate through the array to calculate these diagonals. Specifically, I could iterate through all diagonals that start with rows and columns in depth 1 and then those starting with row 1 of depths 2 and 3. I would add 1 to each row, column, depth index until I terminate in depth 3.

However, this brute-force iteration would not be so efficient. I am looking for an efficient, elegant trick similar to the answer I cited above for 2D matrices. But I cannot think of a simple arithmetic operation on row, column, and depth indices that will result in diagonals consistently having the same value.

I hope my question is clear. I would appreciate any help.


Solution

  • Calculate the slice index of each dimension (3d counterpart to row and col for matrices) and perform the split shown:

    arr <- array(1:60, 5:3)
    
    si <- lapply(1:3, slice.index, x = arr)
    diags3 <- split(arr, list(si[[1]] - si[[2]], si[[2]] - si[[3]]), drop = TRUE)
    str(diags3)
    

    giving (continued after output)

    List of 36
     $ 0.-2 : int 41
     $ 1.-2 : int 42
     $ 2.-2 : int 43
     $ 3.-2 : int 44
     $ 4.-2 : int 45
     $ -1.-1: int 46
     $ 0.-1 : int [1:2] 21 47
     $ 1.-1 : int [1:2] 22 48
     $ 2.-1 : int [1:2] 23 49
     $ 3.-1 : int [1:2] 24 50
     $ 4.-1 : int 25
     $ -2.0 : int 51
     $ -1.0 : int [1:2] 26 52
     $ 0.0  : int [1:3] 1 27 53
     $ 1.0  : int [1:3] 2 28 54
     $ 2.0  : int [1:3] 3 29 55
     $ 3.0  : int [1:2] 4 30
     $ 4.0  : int 5
     $ -3.1 : int 56
     $ -2.1 : int [1:2] 31 57
     $ -1.1 : int [1:3] 6 32 58
     $ 0.1  : int [1:3] 7 33 59
     $ 1.1  : int [1:3] 8 34 60
     $ 2.1  : int [1:2] 9 35
     $ 3.1  : int 10
     $ -3.2 : int 36
     $ -2.2 : int [1:2] 11 37
     $ -1.2 : int [1:2] 12 38
     $ 0.2  : int [1:2] 13 39
     $ 1.2  : int [1:2] 14 40
     $ 2.2  : int 15
     $ -3.3 : int 16
     $ -2.3 : int 17
     $ -1.3 : int 18
     $ 0.3  : int 19
     $ 1.3  : int 20
    

    We verify that this gives the same diagonals as (2) below.

    setdiff(diags3, diags2)
    ## named list()
    
    setdiff(diags2, diags3)
    ## named list()
    

    Motivation. The matrix formula in the question's linked-to post uses the difference between row and col. The generalization to n dimensions in R is slice.index so we looked at whether differences in slice dimensions were constant across diagonals from (2). This shows they are for si[[1]] - si[[2]] and the same for si[[2]] - si[[3]]. We don't need si[[1]] - si[[3]] if we already have those. This suggested using those differences to form the groups. In the code below g, gr and gc are from (2) below and si is from above.

    is.constant <- function(x) diff(range(x)) == 0
    
    all(tapply(si[[1]] - si[[2]], paste(g, gr, gc), is.constant))
    ## [1] TRUE
    
    all(tapply(si[[2]] - si[[3]], paste(g, gr, gc), is.constant))
    ## [1] TRUE
    

    A variation of the solution above is based on noticing that if

    g <- expand.grid(1:5, 1:4, 1:3)
    

    then c(si[[i]]) equals g[[i]] for i = 1, 2, 3 so another solution is

    g <- expand.grid(lapply(dim(arr), seq_len));
    split(arr, list(g[[1]] - g[[2]], g[[2]] - g[[3]]), drop = TRUE)
    

    Old

    1) For a 3d array each diagonal element's position must be delta apart from the position of the next element in the same diagonal; thus, we can group diagonals using the following.

    nr <- nrow(arr); nc <- ncol(arr)
    delta <- nr * nc + nr + 1  # terms get to next layer, col & row
    g <- rep(1:delta, length = length(arr))
    diags <- split(arr, g)
    

    giving the following:

    str(diags)
    ## List of 26
    ##  $ 1 : int [1:3] 1 27 53
    ##  $ 2 : int [1:3] 2 28 54
    ##  $ 3 : int [1:3] 3 29 55
    ##  $ 4 : int [1:3] 4 30 56
    ##  $ 5 : int [1:3] 5 31 57
    ##  $ 6 : int [1:3] 6 32 58
    ##  $ 7 : int [1:3] 7 33 59
    ##  $ 8 : int [1:3] 8 34 60
    ##  $ 9 : int [1:2] 9 35
    ##  $ 10: int [1:2] 10 36
    ##  $ 11: int [1:2] 11 37
    ##  $ 12: int [1:2] 12 38
    ##  $ 13: int [1:2] 13 39
    ##  $ 14: int [1:2] 14 40
    ##  $ 15: int [1:2] 15 41
    ##  $ 16: int [1:2] 16 42
    ##  $ 17: int [1:2] 17 43
    ##  $ 18: int [1:2] 18 44
    ##  $ 19: int [1:2] 19 45
    ##  $ 20: int [1:2] 20 46
    ##  $ 21: int [1:2] 21 47
    ##  $ 22: int [1:2] 22 48
    ##  $ 23: int [1:2] 23 49
    ##  $ 24: int [1:2] 24 50
    ##  $ 25: int [1:2] 25 51
    ##  $ 26: int [1:2] 26 52
    

    2) This is similar but uses a different definition of diagonal so the answer will be different. It avoids the wrap around that occurs with (1). It proceeds as in (1) but then splits any group from (1) in which a subsequent element has a smaller row number to avoid wraparound. It handles column numbers similarly. The ## lines are from (1).

    nr <- nrow(arr); nc <- ncol(arr) ##
    delta <- nr * nc + nr + 1 ##
    g <- rep(1:delta, length = length(arr)) ##
    
    cumneg <- function(x) cumsum(c(TRUE, diff(x) < 0))
    gr <- ave(slice.index(arr, 1), g, FUN = cumneg)
    gc <- ave(slice.index(arr, 2), g, FUN = cumneg)
    
    diags2 <- split(arr, paste(g, gr, gc))
    str(diags2)
    

    giving (continued after output):

    ## List of 36
    ##  $ 1 1 1 : int [1:3] 1 27 53
    ##  $ 10 1 1: int 10
    ##  $ 10 2 1: int 36
    ##  $ 11 1 1: int [1:2] 11 37
    ##  $ 12 1 1: int [1:2] 12 38
    ##  $ 13 1 1: int [1:2] 13 39
    ##  $ 14 1 1: int [1:2] 14 40
    ##  $ 15 1 1: int 15
    ##  $ 15 2 2: int 41
    ##  $ 16 1 1: int 16
    ##  $ 16 1 2: int 42
    ##  $ 17 1 1: int 17
    ##  $ 17 1 2: int 43
    ##  $ 18 1 1: int 18
    ##  $ 18 1 2: int 44
    ##  $ 19 1 1: int 19
    ##  $ 19 1 2: int 45
    ##  $ 2 1 1 : int [1:3] 2 28 54
    ##  $ 20 1 1: int 20
    ##  $ 20 2 2: int 46
    ##  $ 21 1 1: int [1:2] 21 47
    ##  $ 22 1 1: int [1:2] 22 48
    ##  $ 23 1 1: int [1:2] 23 49
    ##  $ 24 1 1: int [1:2] 24 50
    ##  $ 25 1 1: int 25
    ##  $ 25 2 1: int 51
    ##  $ 26 1 1: int [1:2] 26 52
    ##  $ 3 1 1 : int [1:3] 3 29 55
    ##  $ 4 1 1 : int [1:2] 4 30
    ##  $ 4 2 1 : int 56
    ##  $ 5 1 1 : int 5
    ##  $ 5 2 1 : int [1:2] 31 57
    ##  $ 6 1 1 : int [1:3] 6 32 58
    ##  $ 7 1 1 : int [1:3] 7 33 59
    ##  $ 8 1 1 : int [1:3] 8 34 60
    ##  $ 9 1 1 : int [1:2] 9 35
    

    We note that (2) gives the same result (except for order) as the diagonals in the question's linked-to post when appropriately modified for a matrix suggesting that (2) is the approach looked for. (Note that slice.index(m, 1) can be written as row(m) for a matrix m.)

    m <- arr[,,1]
    nr <- nrow(m)
    delta <- nr + 1
    g <- rep(1:delta, length = length(m)) ##
    
    cumneg <- function(x) cumsum(c(TRUE, diff(x) < 0))
    gr <- ave(row(m), g, FUN = cumneg)
    diags.m <- split(m, paste(g, gr))
    
    str(diags.m)
    ## List of 8
    ##  $ 1 1: int [1:4] 1 7 13 19
    ##  $ 2 1: int [1:4] 2 8 14 20
    ##  $ 3 1: int [1:3] 3 9 15
    ##  $ 4 1: int [1:2] 4 10
    ##  $ 4 2: int 16
    ##  $ 5 1: int 5
    ##  $ 5 2: int [1:2] 11 17
    ##  $ 6 1: int [1:3] 6 12 18
    
    str(split(m, row(m) - col(m)))  # based on question's linked-to post
    ## List of 8
    ##  $ -3: int 16
    ##  $ -2: int [1:2] 11 17
    ##  $ -1: int [1:3] 6 12 18
    ##  $ 0 : int [1:4] 1 7 13 19
    ##  $ 1 : int [1:4] 2 8 14 20
    ##  $ 2 : int [1:3] 3 9 15
    ##  $ 3 : int [1:2] 4 10
    

    Update

    Added (2). Fixed. Found better approach. (1) and (2) are now under Old.