Search code examples
rfor-looprasterr-raster

how to add average rasters within for-loop that creates the rasters? R


I have several directories with 700+ binary encoded rasters that i take average the output rasters per directory. however, i currently create the rasters 1 by 1 in a for loop, then load newly created rasters back into R to take the sum to obtain the monthly rainfall total.

However, since I dont need the individual rasters, only the average raster, I have a hunch that I could do this all w/in 1 loop and not save the rasters but just the output average raster, but I am coming up short in how to program this in R.

setwd("~/Desktop/CMORPH/Levant-Clip/200001")

dir.output <- '~/Desktop/CMORPH/Levant-Clip/200001' ### change as needed to give output location
path <- list.files("~/Desktop/CMORPH/MonthlyCMORPH/200001",pattern="*.bz2", full.names=T, recursive=T)

for (i in 1:length(path)) {
  files = bzfile(path[i], "rb")
  data <- readBin(files,what="double",endian = "little", n = 4948*1649, size=4) #Mode of the vector to be read
  data[data == -999] <- NA #covert missing data from -999(CMORPH notation) to NAs
  y<-matrix((data=data), ncol=1649, nrow=4948)
  r <- raster(y)
  e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info 
  tr <- t(r) #transpose 
  re <- setExtent(tr,extent(e)) ### set the extent to the raster
  ry <- flip(re, direction = 'y')
  projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
  C_Lev <- crop(ry, Levant) ### Clip to Levant
  M_C_Lev<-mask(C_Lev, Levant)
  writeRaster(M_C_Lev, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ###the basename allows the file to be named the same as the original
}
# 
raspath <- list.files ('~/Desktop/CMORPH/Levant-Clip/200001',pattern="*.tif",     full.names=T, recursive=T)
rasstk <- stack(raspath)
sum200001<-sum(rasstk)
writeRaster(avg200001, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ###the basename allows the file to be named the same as the original

currently, this code takes about 75 mins to execute, and I have about 120 more directories to go, and am looking for faster solutions.

thank you for all and any comments and input. best, evan


Solution

  • Elaborating on my previous comment, you could try:

    setwd("~/Desktop/CMORPH/Levant-Clip/200001")
    
    dir.output <- '~/Desktop/CMORPH/Levant-Clip/200001' ### change as needed to give output location
    path <- list.files("~/Desktop/CMORPH/MonthlyCMORPH/200001",pattern="*.bz2", full.names=T, recursive=T)
    raster_list = list()
    for (i in 1:length(path)) {
      files = bzfile(path[i], "rb")
      data <- readBin(files,what="double",endian = "little", n = 4948*1649, size=4) #Mode of the vector to be read
      data[data == -999] <- NA #covert missing data from -999(CMORPH notation) to NAs
      y<-matrix((data=data), ncol=1649, nrow=4948)
      r <- raster(y)
      if (i == 1) {
        e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info 
    
      }
      tr <- t(r) #transpose 
      re <- setExtent(tr,extent(e)) ### set the extent to the raster
      ry <- flip(re, direction = 'y')
      projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
      C_Lev <- crop(ry, Levant) ### Clip to Levant
      M_C_Lev<-mask(C_Lev, Levant)
      raster_list[[i]] = M_C_Lev
    }
    # 
    
    rasstk <- stack(raster_list, quick = TRUE) # OR rasstk <- brick(raster_list, quick = TRUE)
    avg200001<-mean(rasstk)
    writeRaster(avg200001, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ###the basename allows the file to be named the same as the original
    

    Using the "quick" options in stack should definitely speed-up things, in particular if you have many rasters.

    Another possibility is to first compute the average, and then perform the "spatial proceesing". For example:

    for (i in 1:length(path)) {
      files = bzfile(path[i], "rb")
      data <- readBin(files,what="double",endian = "little", n = 4948*1649, size=4) #Mode of the vector to be read
      data[data == -999] <- NA #covert missing data from -999(CMORPH notation) to NAs
    
      if (i == 1) {
       totdata  <-  data 
       num_nonNA <- as.numeric(!is.na(data))
      } else {
    totdata = rowSums(cbind(totdata,data), na.rm = TRUE)
    # We have to count the number of "valid" entries so that the average is correct !
    num_nonNA = rowSums(cbind(num_nonNA,as.numeric(!is.na(data))),na.rm = TRUE)
      }
    }
    
    avg_data = totdata/num_nonNA # Compute the average
    
    # Now do the "spatial" processing
    
    y<-matrix(avg_data, ncol=1649, nrow=4948)
    r <- raster(y)
    e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info 
    tr <- t(r) #transpose 
    re <- setExtent(tr,extent(e)) ### set the extent to the raster
    ry <- flip(re, direction = 'y')
    projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
    C_Lev <- crop(avg_data, Levant) ### Clip to Levant
    M_C_Lev<-mask(C_Lev, Levant)
    writeRaster(M_C_Lev, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ###the basename allows the file to be named the same as the original
    

    This could be faster or slower, depending from "how much" you are cropping the original data.

    HTH,

    Lorenzo