I have a raster brick (ncell=28536 and nlayers=181). I need to run mathematical functions on the original brick and create two more bricks of same size. Where both output bricks are dependent on each other.
inputBrick has 181 layers and 28536 cells per layer. outputBrick1 will calculate values of its 1st layer by analyzing outputBrick2's 1st layer. Then outputBrick2 will calculate values of its 2nd layer by analyzing outputBricks1's 1st layer and so on.
I created a function that works fine with 24 cells and 181 layers. But takes forever for 28000 cells and 181 layers. I know I shouldn't be using for loops for this but as I'm not a programmer I'm struggling.
Here is some example data for a much smaller dataset. There are 3 RasterBricks. Input has values while both outputs are empty
library(raster)
b <- brick(ncols=5, nrows=5, nl=5)
inBrick <- setValues(b, runif(ncell(b) * nlayers(b)))
inBrick[c(1,2,3,22,23,24,25)] <- NA
outBrick1 <- inBrick
outBrick1[] <- NA
outBrick2 <- outBrick1
ini <- 0.3
p <- 0.15
p1 <- p/3
p2 <- p-(p/3)
fc <- 0.3
var1 <- which(!is.na(inBrick[[1]][]))
outBrick2[[1]][var1] <- ini
### now outBrick2 has initial values in 1st layer
weather <- c(0.1, 0, 0, 0, 0.3)
Calculations that I want to do and have no idea how to do it efficiently without using for loops
var3 <- 1:ncell(inBrick)
### outBrick1 Calculations
for (i in 1:nlayers(inBrick)) {
varr1 <- inBrick[[i]][]*(((outBrick2[[i]][]-p1)/(p2))^2)
for (j in 1:ncell(inBrick)) {
if(!is.na(outBrick2[[i]][j])){
if(outBrick2[[i]][j]>p){
outBrick1[[i]][j] <- inBrick[[i]][j]
}else{
outBrick1[[i]][j] <- varr1[j]
}
}
}
###outBrick2 Calculations
for (k in 2:nlayers(inBrick)) {
var2 <- outBrick2[[k-1]][] + (weather[k-1]-outBrick1[[k-1]][])/100
for(l in 1:ncell(inBrick)){
var3[l] <- min(fc, var2[l])
}
outBrick2[[k]][] <- var3
}
}
Now, I want to basically understand what the best approach is to deal with situations like this. I tried increasing memory too by following commands
rasterOptions(maxmemory = 5.17e+10)
rasterOptions(memfrac = 0.8)
rasterOptions(chunksize = 5.17e+10)
but when I see CPU and RAM usage its barely 6% and 10% respectively. R uses only 5% CPU and 1GB RAM. My system has 64GB RAM, 16GB GPU.
Here is an attempt. This is much more concise. It is only three times faster on this example, but the gain may be larger on your real data.
library(terra)
b <- r1 <- r2 <- rast(ncols=5, nrows=5, nl=5, vals=NA)
set.seed(0)
values(b) <- runif(size(b))
b[c(1,2,3,22,23,24,25)] <- NA
p <- 0.15
p1 <- p/3
p2 <- p-(p/3)
fc <- 0.3
weather <- c(0.1, 0, 0, 0, 0.3)
r2[[1]] <- ifel(is.na(b[[1]]), NA, 0.3)
for (i in 1:nlyr(b)) {
varr1 <- b[[i]] * (((r2[[i]] - p1)/p2)^2)
r1[[i]] <- ifel(r2[[i]] > p, b[[i]], varr1)
for (k in 2:nlyr(b)) {
r2[[k]] <- min(r2[[k-1]] + (weather[k-1] - r1[[k-1]]) /100, fc)
}
}
If speed is the issue, and the raster data can be loaded into RAM, this may be much faster:
db <- values(b)
dr1 <- values(r1)
dr2 <- values(r2)
for (i in 1:ncol(db)) {
varr1 <- db[, i] * (((dr2[, i] - p1)/p2)^2)
dr1[,i] <- ifelse(dr2[, i] > p, db[, i], varr1)
for (k in 2:ncol(b)) {
dr2[,k] <- pmin(dr2[, k-1] + (weather[k-1] - dr1[, k-1]) /100, fc)
}
}
values(r1) <- dr1
values(r2) <- dr2