R raster error [writeStart] file exists. You can use 'overwrite=TRUE' to overwrite it


I have 36 csv files with lat, lon and value for random points in the world. Each file is quite big, so it's not practical to merge them into a single file.

What I want to do is:

  1. Create a blank raster covering the entire globe
  2. Loop through each csv and fill the blank raster with the value
  3. If a cell of the blank raster has no point in the csv, simply assign a value of 1 to it.

Here's my approach

library(terra)

# create a blank raster 
r <- terra::rast(ncols=129600, 
                 nrows=64800, 
                 xmin=-180, 
                 xmax=180, 
                 ymin=-90, 
                 ymax=90,
                 resolution = 0.002777778,
                 crs="+proj=longlat +datum=WGS84")


# get list of csv in folder 
file_list <- list.files(getwd())
 
# loop through each csv
for(f in seq_along(file_list)){
   
  file_ref <- file_list[f]
   
  temp <- read.csv(file_ref)
  v <- terra::vect(temp, geom = c("lon", "lat"), crs = "+proj=longlat +datum=WGS84") # convert csv to point
    
  terra::rasterize(x = v, y = r, field = "value", background = 1, 
                  filename = file.path(getwd(), 'mask.tif'),
                  overwrite = FALSE) 
  rm(temp, v)        
}
 

At the 10th iteration, I get the following error:

Error: [writeStart] file exists. You can use 'overwrite=TRUE' to overwrite it
 

I don't understand the error or how to fix it.


Solution

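  • The error occurs because every iteration writes to the same file, mask.tif, with overwrite = FALSE: as soon as mask.tif exists, the next call refuses to replace it. Since you don't want to keep everything in memory, it seems you do have to write your data to disk, so each iteration needs its own file. I'm no expert in this, so please let me know if this works.

    First of all, I'd get rid of the static filename in rasterize():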

    library(terra)
    
    # create a blank raster 
    r <- rast(ncols = 129600, 
              nrows = 64800, 
              xmin = -180, 
              xmax = 180, 
              ymin = -90, 
              ymax = 90,
              resolution = 0.002777778,
              crs = "+proj=longlat +datum=WGS84")
    
    
    # get list of csv files in folder (pattern is a regular expression, not a glob)
    file_list <- list.files(pattern = "\\.csv$")
     
    # loop through each csv
    for(f in seq_along(file_list)){
       
      # read csv 
      temp <- read.csv(file_list[f])
    
      # convert data.frame to SpatVector
      v <- vect(temp, geom = c("lon", "lat"), crs = "+proj=longlat +datum=WGS84") 
      
      # burn values to raster and write tif to disk 
      rasterize(x = v, y = r, field = "value", background = 1, 
                filename = paste0("mask_", f, ".tif"),
                overwrite = FALSE)        
    }
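
    To see the difference on a small scale, here is a minimal sketch, assuming a tiny 36 x 18 grid and two made-up point tables in place of the real csv files (pts, out_dir and same_file are hypothetical names, not part of the question). It first repeats the write to a single filename, which fails on the second call, and then writes one file per input. The zero-padded numbers are an optional extra so that the files sort in numeric order when listed later:

```r
library(terra)

# Tiny stand-in grid (assumption: the real one is 129600 x 64800)
r <- rast(ncols = 36, nrows = 18, xmin = -180, xmax = 180, ymin = -90, ymax = 90,
          crs = "+proj=longlat +datum=WGS84")

# Two made-up point tables standing in for the csv files
pts <- list(
  data.frame(lon = c(-100, 20), lat = c(40, -10), value = c(0.2, 0.5)),
  data.frame(lon = c(150, -60), lat = c(-30, 60), value = c(0.1, 0.9))
)

# Work in a fresh temporary folder so nothing real is touched
out_dir <- tempfile("masks_")
dir.create(out_dir)

# 1) Same filename for every input, overwrite = FALSE: the second call errors
same_file <- file.path(out_dir, "mask.tif")
res <- lapply(pts, function(d) {
  v <- vect(d, geom = c("lon", "lat"), crs = "+proj=longlat +datum=WGS84")
  try(rasterize(v, r, field = "value", background = 1,
                filename = same_file, overwrite = FALSE), silent = TRUE)
})
second_call_failed <- inherits(res[[2]], "try-error")

# 2) One filename per input: no clash, overwrite = FALSE can stay
for (i in seq_along(pts)) {
  v <- vect(pts[[i]], geom = c("lon", "lat"), crs = "+proj=longlat +datum=WGS84")
  rasterize(v, r, field = "value", background = 1,
            filename = file.path(out_dir, sprintf("mask_%02d.tif", i)),
            overwrite = FALSE)
}
written <- list.files(out_dir, pattern = "^mask_[0-9]+\\.tif$")
print(second_call_failed)
print(written)
```

    Note that running the second loop again would hit the same error, because the numbered files then already exist; in that case either delete them first or set overwrite = TRUE.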
    

    Now you should have 36 tif files written to disk. You can re-import them all at once with rast(). What you get is a SpatRaster object with 36 layers:

    # match only the per-file masks written above (regular expression, not a glob)
    r_list <- list.files(pattern = "^mask_[0-9]+\\.tif$")
    
    r_stack <- rast(r_list)
    
    # the result should look approximately like this:
    
    r_stack
    #> class       : SpatRaster 
    #> dimensions  : 64800, 129600, 36  (nrow, ncol, nlyr)
    #> resolution  : 0.002777778, 0.002777778  (x, y)
    #> extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
    #> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
    #> sources     : p1_mask.tif  
    #>               p2_mask.tif  
    #>               p3_mask.tif  
    #> names       :        lyr.1,        lyr.1,        lyr.1 
    #> min values  : 0.0008870518, 0.0001774965, 0.0022119265 
    #> max values  :            1,            1,            2 
    

    You can reduce the 36 layers to a single one with, for example, min(), and write the result to disk. max() and mean() would probably not make much sense here, because your values are < 1 (as I learned in your other question) and your background value is 1:

    min(r_stack) |> writeRaster("full_mask.tif")
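
    A toy illustration of why min() fits this case, assuming two hypothetical three-cell layers held in memory, where 1 is the background and real values are below 1. The cell-wise minimum keeps the real value wherever any layer has one, while the mean is pulled toward the background:

```r
library(terra)

# Hypothetical 1 x 3 template; 1 marks background, values < 1 are real data
r <- rast(ncols = 3, nrows = 1, xmin = 0, xmax = 3, ymin = 0, ymax = 1)
a <- setValues(r, c(0.2, 1, 1))   # layer 1: data only in the first cell
b <- setValues(r, c(1, 0.5, 1))   # layer 2: data only in the second cell
s <- c(a, b)                      # two-layer SpatRaster

# Cell-wise summaries across layers
cell_min  <- as.vector(values(min(s)))
cell_mean <- as.vector(values(mean(s)))

print(cell_min)   # 0.2 0.5 1.0: real values survive, untouched cells stay 1
print(cell_mean)  # 0.60 0.75 1.00: real values are diluted by the background
```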
    

    This should work quite well as long as min() suits your needs. If you need another aggregation function, you will probably have to drop background = 1 from rasterize() and use writeRaster(..., NAflag = 1) instead. Also consider adjusting the fun argument of rasterize(), because this is one of the main places where aggregation takes place.
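
    As a rough sketch of that alternative, assuming a hypothetical three-cell grid and two made-up point tables: the background is left as NA, fun = mean averages points that fall into the same cell, and the layers are then averaged with na.rm = TRUE. Note that this sketch fills the remaining empty cells with 1 using ifel() in memory, which is a swapped-in technique rather than the NAflag route mentioned above:

```r
library(terra)

# Hypothetical 1 x 3 grid standing in for the global raster
r <- rast(ncols = 3, nrows = 1, xmin = 0, xmax = 3, ymin = 0, ymax = 1,
          crs = "+proj=longlat +datum=WGS84")

# Two made-up point tables; the first has two points in the left-most cell
pts <- list(
  data.frame(lon = c(0.4, 0.6), lat = c(0.5, 0.5), value = c(0.2, 0.4)),
  data.frame(lon = 1.5, lat = 0.5, value = 0.5)
)

# No background argument: cells without points stay NA.
# fun = mean decides what happens when several points share a cell.
layers <- lapply(pts, function(d) {
  v <- vect(d, geom = c("lon", "lat"), crs = "+proj=longlat +datum=WGS84")
  rasterize(v, r, field = "value", fun = mean)
})
s <- rast(layers)

# Average across layers, ignoring cells that have no data in a layer
m <- mean(s, na.rm = TRUE)

# Cells that are empty in every layer get the value 1 at the very end
filled <- ifel(is.na(m), 1, m)
result <- as.vector(values(filled))
print(result)
```

    Because the background is only filled in after aggregation, it no longer distorts the mean the way a background of 1 in every layer would.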