Tags: r, parallel-processing, raster

Does clusterApply in R affect the calculation of the average for each layer of a raster brick?


I have raster bricks for daily minimum and maximum temperatures from Daymet. The geographic extent is the continental U.S. Each brick has 365 layers, one for each day of the year. I want to calculate the mean daily temperature from a pair of min. and max. bricks for each year. It seems the overlay() function in the R raster package does what I want. For example:

library(raster)
tmin <- brick("tmin_2018.nc")
tmax <- brick("tmax_2018.nc")
tmean <- overlay(tmin, tmax, fun=mean)

The thing is, it takes a long time and a lot of RAM, more RAM than I have. I do have access to a high-performance computing cluster, and I have code for this process on the cluster that uses the OpenMPI utility to provide parallel processing. My test case only uses 5 layers, and I request 6 nodes from the cluster for the job.

library(iterators)
library(foreach)
library(snow)
library(parallel)
library(Rmpi)
library(compiler)
library(raster)
library(rgdal)
library(ncdf4)
print(mpi.universe.size())

nras <- brick("infil/ras_in/daymet_v3_tmax_2009_na.nc4")
xras <- brick("infil/ras_in/daymet_v3_tmin_2009_na.nc4")

cl <- makeCluster( mpi.universe.size() - 1, type='MPI' )

clusterExport(cl, "nras")
clusterExport(cl, "xras")
clusterExport(cl, c('overlay', 'writeRaster', 'brick'))

easy_fn <- function(i) {
        overlay(nras[[i]], xras[[i]], fun = mean)
        }

out <- clusterApply(cl, 1:5, fun=easy_fn)
outbrickb <- brick(out)
writeRaster(outbrickb, "outfil/rasout/testbrick.tif", format="GTiff", datatype="FLT4S", overwrite=TRUE)

And that also works. But the values are different. When I calculate the mean for one layer (day) with the first code chunk above, I get values between -55.255 and 31.08. After running a small number of layers as a test on the HPC cluster with the 2nd code chunk, I get values between -50 and 44.5. If things were off by decimal places, I wouldn't be concerned, but the values are far apart. Is there something I'm missing in how clusterApply() or my little easy_fn() works vs. just calling overlay() that would lead to different values?


Solution

  • It may be easier and faster to use terra instead of raster

    Example data

    library(terra)
    tmin <- rast(system.file("ex/logo.tif", package="terra")) / 10
    tmax <- tmin + 10
    

    Solution

    tavg1 <- mean(tmin, tmax)
    

    Note that this works in terra because it computes the parallel mean (by layer), whereas raster would return the mean of all layers. To get that with terra you would do

    m <- mean(c(tmin, tmax))
    

    The terra equivalent of raster's overlay is terra::lapp, and you might need it for more complex cases.

    tavg2 <- lapp(sds(tmin, tmax), function(x,y) (x + y)/2)
    
    all(values(tavg1) == values(tavg2))
    #[1] TRUE
    

    Now with raster

    Example data

    library(raster)
    tmin <- brick(tmin)
    tmax <- brick(tmax)
    

    Using an algebraic approach could be faster than overlay

    tavg3 <- (tmin + tmax) / 2
    tavg4 <- overlay(tmin, tmax, fun = mean)
    

    With your function

    easy_fn <- function(i) {
        overlay(tmin[[i]], tmax[[i]], fun = mean)
    }
    
    x <- lapply(1:nlayers(tmin), easy_fn)
    tavg5 <- stack(x)
    

    And now in parallel

    library(parallel)
    cl <- makeCluster(getOption("cl.cores", 2))
    clusterExport(cl, c('overlay', 'tmin', 'tmax'))
    out <- clusterApply(cl, 1:nlayers(tmin), fun=easy_fn)
    stopCluster(cl)
    tavg6 <- stack(out)
    
    all(values(tavg1) == values(tavg3))
    #[1] TRUE
    all(values(tavg1) == values(tavg4))
    #[1] TRUE
    all(values(tavg1) == values(tavg5))
    #[1] TRUE
    all(values(tavg1) == values(tavg6))
    #[1] TRUE
    

    All results are the same.

    Now with your larger datasets, and timing the different approaches

    fmx <- "daymet_v4_daily_hi_tmax_2015.nc"
    fmn <- "daymet_v4_daily_hi_tmin_2015.nc"
    library(terra)
    tmin <- rast(fmn)
    tmax <- rast(fmx)
    system.time(tavg1 <- mean(tmin, tmax))
    #   user  system elapsed 
    #   1.88    0.75    2.62 
    
    system.time(tavg2 <- lapp(sds(tmin, tmax), function(x,y) (x + y)/2))
    #   user  system elapsed 
    #   2.34    1.40    3.76 
    

    These are clearly the fastest approaches (see below). You can add a filename argument to lapp, but with mean you would also have to use writeRaster.
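
    For example, a minimal sketch of writing the result straight to disk (the output file names here are just placeholders):

    # lapp can write directly to disk via its filename argument
    tavg2 <- lapp(sds(tmin, tmax), function(x, y) (x + y) / 2,
                  filename = "tavg_lapp.tif", overwrite = TRUE)

    # mean() has no filename argument, so save its result with writeRaster
    tavg1 <- mean(tmin, tmax)
    writeRaster(tavg1, "tavg_mean.tif", overwrite = TRUE)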

    And with raster

    library(raster)
    tmin <- brick(fmn)
    tmax <- brick(fmx)
    system.time(tavg3 <- (tmin + tmax) / 2)
    #   user  system elapsed 
    #  13.94    4.42   18.47 
    

    In this case overlay is extraordinarily slow

    system.time(tavg4 <- overlay(tmin, tmax, fun = mean))
    #  user  system elapsed 
    # 478.06    5.78  488.16 
    
    system.time(x <- lapply(1:nlayers(tmin), easy_fn))
    #   user  system elapsed 
    # 474.03    5.14  482.42 
    tavg5 <- stack(x)
    

    It does go faster with parallelization, but not by enough to make it worth pursuing

    library(parallel)
    cl <- makeCluster(getOption("cl.cores", 2))
    clusterExport(cl, c('overlay', 'tmin', 'tmax'))
    system.time(out <- clusterApply(cl, 1:nlayers(tmin), fun=easy_fn))
    #   user  system elapsed 
    #   0.53    0.78  184.82 
    stopCluster(cl)
    tavg6 <- stack(out)
    

    I do not see a meaningful difference between the results of the sequential and parallelized approaches

    d <- tavg6 - tavg4
    d
    #class      : RasterBrick 
    #dimensions : 584, 284, 165856, 365  (nrow, ncol, ncell, nlayers)
    #resolution : 1000, 1000  (x, y)
    #extent     : -5802750, -5518750, -622500, -38500  (xmin, xmax, ymin, ymax)
    #crs        : +proj=lcc +lat_0=42.5 +lon_0=-100 +lat_1=25 +lat_2=60 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs 
    #source     : r_tmp_2021-11-20_103534_54328_41274.grd 
    #names      :       layer.1,       layer.2,       layer.3,       layer.4,       layer.5,       layer.6,       layer.7,       layer.8,       layer.9,      layer.10,      layer.11,      layer.12,      layer.13,      layer.14,      layer.15, ... 
    #min values : -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, -9.536743e-07, ... 
    #max values :  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07,  9.536743e-07, ...
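
    The remaining per-cell differences are on the order of 1e-6, most likely single-precision rounding from intermediate storage rather than a real discrepancy between the methods. A quick sanity check, as a sketch:

    # largest absolute per-cell difference across all layers
    # (cellStats returns the per-layer maxima; expect a value around 1e-6)
    max(cellStats(abs(d), "max"))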