I need to aggregate a rasterbrick into monthly values. Normally, this would be easy by using zApply
function from raster
package. However, I have a large rasterbrick and this would take a very long time.
So basically, I am wondering if this would be easy to do it with some libraries like parallel
or clusterR
but I have no clue how to parallelize this process
# create a random raster stack
library(raster)
lay <- stack()
for (i in 1:365){
print(i)
ras <- matrix(rnorm(500, mean = 21, sd = rnorm(21, 12, 4)))
ras <- raster(ras)
lay <- addLayer(lay, ras)
}
dats <- seq(as.Date('2000-01-01'), length.out = nlayers(lay), by = 'days')
lay <- setZ(lay, dats)
monthlies <- zApply(lay, by = format(dats,"%m"), fun = 'mean') # aggregate from daily to monthly.
Thanks!
You can use foreach
and doParallel
to achieve your result.
You will need to:
detectCores()
DoParallel
to work with your CPU cores with registerDoParallel(numCores)
foreach
loop with the needed packages, any init variable, and a method to combine the results.Your code will look like this:
library(foreach)
library(doParallel)
library(raster)
lay <- stack()
## Loading required package: iterators
numCores <- detectCores()
registerDoParallel(numCores) # use multicore, set to the number of our cores
lay <- foreach (i=1:365, .init = lay, .combine = addLayer , .packages = "raster") %dopar% {
print(i)
ras <- matrix(rnorm(500, mean = 21, sd = rnorm(21, 12, 4)))
ras <- raster(ras)
}
dats <- seq(as.Date('2000-01-01'), length.out = nlayers(lay), by = 'days')
lay <- setZ(lay, dats)
monthlies <- zApply(lay, by = format(dats,"%m"), fun = 'mean') # aggregate from daily to monthly
# When you're done, clean up the cluster
stopImplicitCluster()
You can test the speed improvement using System.time()
. These are my results:
#Time with a standard for loop
system.time({
for (i in 1:365){
print(i)
ras <- matrix(rnorm(500, mean = 21, sd = rnorm(21, 12, 4)))
ras <- raster(ras)
lay <- addLayer(lay, ras)
}
})
user system elapsed
66.29 0.09 67.15
#Testing foreach loop time
system.time({
lay <- foreach (i=1:365, .init = lay, .combine = addLayer , .packages = "raster") %dopar% {
print(i)
ras <- matrix(rnorm(500, mean = 21, sd = rnorm(21, 12, 4)))
ras <- raster(ras)
}
})
user system elapsed
21.72 0.09 25.58
As we can see, there was an effective speed improvement by using this method.
Hope this helps.