In R, I'm trying to write a simple function like the one below that sums the elements in a row of a data frame that are k positions away from the (i, j) element. If the element is near the edge (e.g. j = 1 or j = n), I'd like the nonexistent element to the left or right to be treated as 0. With my current function, I get an error if the element to the right doesn't exist, or a vector if the one on the left doesn't exist, because of how R handles negative indices. Is there a nicer way to write this function without just using if statements to deal with the three cases (element is in the middle, too far left, or too far right)?
sum_nearby <- function(dat, i, j, k) {
  dat[i, j - k] + dat[i, j + k]
}
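To make the two failure modes concrete, here is a small sketch. The one-row data frame and the names left_edge and right_edge are made up for illustration and are not part of the original function.

```r
# Hypothetical 1 x 4 data frame, used only for illustration
dat <- data.frame(a = 1, b = 2, c = 3, d = 4)

# Negative index: drops column 1 instead of pointing "left of" it,
# so three columns come back rather than a single value
left_edge <- dat[1, 1 - 2]
ncol(left_edge)
# [1] 3

# Index past the last column: data frames raise an error,
# which is caught here so the script keeps running
right_edge <- tryCatch(dat[1, 4 + 2], error = function(e) "error")
right_edge
# [1] "error"
```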
You can clamp the two indices to the valid column range:
sum_nearby <- function(dat, i, j, k) {
  left <- max(1, j - k)           # clamp to the first column
  right <- min(j + k, ncol(dat))  # clamp to the last column
  dat[i, left] + dat[i, right]
}
Note that this substitutes the first or last column for an out-of-range neighbour rather than treating it as 0, so close to the boundary the k-neighbourhood is not symmetric.
Let's consider a simplified case / example with a vector:
f <- function (x, j, k) {
  left <- max(1, j - k)
  right <- min(j + k, length(x))
  x[left] + x[right]
}
Say
x <- seq(2, 10, by = 2)
# [1] 2 4 6 8 10
Let's test the summation effect for all elements with k = 2:
sapply(1:5, f, k = 2, x = x)
# [1] 8 10 12 14 16
8 is actually x[1] + x[3], instead of x[-1] + x[3]. 10 is x[1] + x[4], rather than x[0] + x[4].
If you simply want to ignore those "out-of-bound" values, use an if:
sum_nearby <- function(dat, i, j, k) {
  # Valid column indices run from 1 to ncol(dat), so 0 is already out of bounds
  if (j - k < 1) dat[i, j + k]
  else if (j + k > ncol(dat)) dat[i, j - k]
  else dat[i, j + k] + dat[i, j - k]
}
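If you would rather avoid branching altogether, one option is to filter the two candidate indices down to those that are in range and sum whatever remains, so a missing neighbour contributes 0. The following is only a minimal sketch: the name sum_nearby_zero and the sample data frame are made up for illustration, and it assumes the selected columns are numeric.

```r
# Sum the row-i entries k columns to the left and right of column j,
# counting any neighbour outside 1..ncol(dat) as 0.
sum_nearby_zero <- function(dat, i, j, k) {
  idx <- c(j - k, j + k)
  valid <- idx >= 1 & idx <= ncol(dat)  # keep only in-range columns
  # drop = FALSE keeps the data-frame shape even with 0 or 1 columns;
  # sum() over an empty selection is 0
  sum(unlist(dat[i, idx[valid], drop = FALSE]))
}

# Hypothetical one-row data frame holding the same values as x above
dat <- data.frame(a = 2, b = 4, c = 6, d = 8, e = 10)
sapply(1:5, function(j) sum_nearby_zero(dat, 1, j, k = 2))
# [1]  6  8 12  4  6
```

Because the filtering happens before indexing, the same code also returns 0 when both neighbours fall outside the data frame, for example with k larger than ncol(dat).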