r, dataframe, indices

Summing nearby elements of a matrix in R


In R, I'm trying to write a simple function like the one below, which sums the two elements in row i of a data frame that are k positions away from the (i, j) element. If the element is near an edge (e.g. j = 1 or j = n), I'd like the missing neighbour on the left or right to be treated as 0. With my current function, I get an error if the element to the right doesn't exist, or a vector if the one on the left doesn't exist, because of how R handles negative indices. Is there a nicer way to write this function than using if statements to handle the three cases (element in the middle, too far left, or too far right)?

sum_nearby <- function(dat, i, j, k) {
  dat[i, j - k] + dat[i, j + k]
}

Solution

  • You can clamp both indices to the valid column range:

    sum_nearby <- function(dat, i, j, k) {
      left <- max(1, j - k)
      right <- min(j + k, ncol(dat))
      dat[i, left] + dat[i, right]
      }
    

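    Note that this does not treat a missing neighbour as 0. Near the boundary, an out-of-range index is replaced by the first or last column, so the k-neighbourhood is no longer symmetric.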

    Consider a simplified example with a vector:

    f <- function (x, j, k) {
      left <- max(1, j - k)
      right <- min(j + k, length(x))
      x[left] + x[right]
      }
    

    Say

    x <- seq(2, 10, by = 2)
    # [1] 2 4 6 8 10
    

    Let's test the summation effect for all elements with k = 2:

    sapply(1:5, f, k = 2, x = x)
    # [1]  8 10 12 14 16
    
    • The first value, 8, is x[1] + x[3] rather than x[-1] + x[3].
    • The second value, 10, is x[1] + x[4] rather than x[0] + x[4].
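For comparison, here is a short sketch of what the unclamped indices would do on the same vector. It uses only base R indexing, and x is redefined so the snippet runs on its own:

```r
x <- seq(2, 10, by = 2)

x[-1]         # a negative index drops element 1: 4 6 8 10
x[0]          # a zero index selects nothing: numeric(0)
x[6]          # an index past the end of a vector: NA

x[-1] + x[3]  # x[3] is recycled over four values: 10 12 14 16
x[0] + x[4]   # adding to a length-zero vector: numeric(0)
```

A data frame is stricter than a vector here: selecting a column beyond ncol(dat) raises an error instead of returning NA, which is the error described in the question.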

    If you instead want to ignore those out-of-bounds values, that is, treat them as 0, use if:

    sum_nearby <- function(dat, i, j, k) {
      # An out-of-bounds neighbour contributes 0; index 0 is also out of bounds
      left <- if (j - k >= 1) dat[i, j - k] else 0
      right <- if (j + k <= ncol(dat)) dat[i, j + k] else 0
      left + right
      }
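
A branch-free alternative, offered as a sketch rather than as the answerer's method: pad the row with k zeros on each side so that both neighbours always exist. The name sum_nearby_padded is hypothetical, and the code assumes numeric columns, 1 <= j <= ncol(dat), and k >= 0.

```r
# Hypothetical helper: zero-pad row i so out-of-bounds neighbours read as 0.
sum_nearby_padded <- function(dat, i, j, k) {
  # unlist() flattens a data frame row; a matrix row passes through unchanged
  row <- c(rep(0, k), unname(unlist(dat[i, ])), rep(0, k))
  # Original column j sits at padded position j + k,
  # so its neighbours j - k and j + k sit at j and j + 2 * k
  row[j] + row[j + 2 * k]
}

dat <- data.frame(a = 2, b = 4, c = 6, d = 8, e = 10)
res <- sapply(1:5, function(j) sum_nearby_padded(dat, 1, j, 2))
res
# [1]  6  8 12  4  6
```

Because the padding is k wide on each side, the two padded indices always fall inside the padded vector for any valid j, so no if statement is needed.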