For each cell of a rasterStack, I want to find the most recent layer where the value exceeds a fixed threshold. Layers are stacked in chronological order, so this corresponds to the maximum index. Ultimately I want to know 1) the year of that layer (taken from the layer name), and 2) the value in that year.
I wrote a function to do this and was getting the wrong result. I modified the function in a way that I did not think would change it; I now get the correct result. My question is why these functions produce something different.
Set up the example:
library(raster)
library(rasterVis)
### Example raster stack:
set.seed(123123)
r1 <- raster(nrows = 10, ncols = 10)
r2 <- r3 <- r4 <- r1
r1[] <- rbinom(ncell(r1), 1, prob = .1)
r2[] <- rbinom(ncell(r1), 1, prob = .1)
r3[] <- rbinom(ncell(r1), 1, prob = .1)
r4[] <- rbinom(ncell(r1), 1, prob = .1)
rs <- stack(r1, r2, r3, r4)
names(rs) <- paste0("yr", 1:4)
These are the functions I would have thought would be the same...is there a reason not to have a vector as an intermediate step?
### Function to find index of last event:
# v1; this is wrong
findLast <- function(x) {
fire.ind <- ifelse(any(x > 0), which(x > 0), NA)
max.ind <- max(fire.ind)
}
# v2; this one gives correct answer
findLast2 <- function(x) {
max.ind <- ifelse(any(x > 0), max(which(x > 0)), NA)
}
testFind <- calc(rs, findLast)
freq(testFind)
testFind2 <- calc(rs, findLast2)
all.equal(testFind, testFind2)
Show the example inputs and the different results:
# plot:
s2 = stack(rs, testFind, testFind2)
levelplot(s2, pretty = TRUE)
Code to get the final layers I want:
### Most recent year:
nameFromInd <- function(x) {
yr <- as.integer(gsub(".*(\\d.*).*", "\\1", names(rs)[x]))
}
testYr <- calc(testFind2, nameFromInd)
### Value in most recent year:
testYrValue <- stackSelect(rs, testFind2)
Any insight into what I'm not seeing here? I haven't played around with alternatives that enhance speed, but any suggestions are welcome as I'll be doing this on a very large dataset.
sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
ifelse
can be a bit surprising in that it returns a value with the same "shape" as the first argument, which has length one. So it returns the first value only of the result of which(x > 0)
. Always check the functions you use before using them with calc
.
x <- c(-1:2, 1:-1)
ifelse(any(x > 0), which(x > 0), NA)
#[1] 3
ifelse(any(x > 0), max(which(x > 0)), NA)
#[1] 5
ifelse
is a pretty involved function, and I think you can avoid it by doing:
r <- calc(rs, function(x) max(which(x > 0)))
y <- stackSelect(rs, r)
(and ignore the warnings). Or suppress them:
options(warn=-1)
r <- calc(rs, function(x) max(which(x > 0)))
options(warn=0)
testYrValue <- stackSelect(rs, r)
You can also combine what you do with calc
and stackSelect
like this
f1 <- function(x) {
if (any(x>0)) {
i <- max(which(x > 0))
cbind(i, x[i])
} else {
cbind(NA, NA)
}
}
rr1 <- calc(rs, f1)
Or the shortcut variant:
f2 <- function(x) {
i <- max(which(x > 0))
cbind(i, x[i])
}
rr2 <- calc(rs, f2)
Or
f3 <- function(x) {
i <- max(which(x > 0))
z <- cbind(i, x[i])
z[!is.finite(z)] <- NA
z
}
rr3 <- calc(rs, f3)