Streamlining binary rasterization in R

I have a few very small country-level polygon and point shapefiles that I would like to rasterize in R. The final product should be one global binary raster (indicating whether grid cell center is covered by a polygon / point lies within cell or not). My approach is to loop over the shapefiles and do the following for each shapefile:

  # load shapefile  
  shp       = sf::read_sf(shapefile_path)
  
  # create a global raster template with resolution 0.0083
  ext       = extent(-180.0042, 180.0042, -65.00417, 75.00417)
  gridsize  = 0.008333333
  r         = raster(ext, res = gridsize)
  
  # rasterize polygon or point shapefile to raster
  rr        = rasterize(shp, r, background = 0) #all grid cells that are not covered get 0

  # convert to binary raster
  values(rr)[values(rr)>0] = 1

Here, rr is the raster file where the polygons / points in shp are coded as 1 and all other grid cells are coded as 0. Afterwards, I take the sum over all rr to arrive at one global binary raster file including all polygons / points.

The final two steps are incredibly slow. In addition, I get RAM problems when I try to replace the all positive values in rr with 1 as the cell count is very large due to the fine resolution. I was wondering whether it is possible to come up with a smarter solution for what I'd like to achieve.

I have already found the fasterize package that has a speedy implementation of rasterize which works fine. I think it would be of great help if someone has a solution where rasterize directly returns a binary raster.

Solution

This is how you can do this better with raster. Note the value=1 argument, and also that that I changed your specification of the extent -- as what you do is probably not correct.

library(raster)
v <- shapefile(shapefile_path)
ext <- extent(-180, 180, -65, 75)
r <- raster(ext, res = 1/120)
rr <- rasterize(v, r, value=1, background = 0)

There is no need for your last step, but you could have done

rr <- clamp(rr, 0, 1)
# or 
rr <- rr > 0
# or
rr <- reclassify(rr, cbind(1, Inf, 1))

raster::calc is not very efficient for simple arithmetic like this

It should be much faster to rasterize all vector data in one step, rather than in a loop, especially with large rasters like this (for which the program may need to write a temp file for each iteration).

To illustrate this solution with example data

library(raster)
cds1 <- rbind(c(-180,-20), c(-140,55), c(10, 0), c(-140,-60))
cds2 <- rbind(c(-10,0), c(140,60), c(160,0), c(140,-55))
cds3 <- rbind(c(-125,0), c(0,60), c(40,5), c(15,-45))
v <- spLines(cds1, cds2, cds3)   
r <- raster(ncols=90, nrows=45)

r <- rasterize(v, r, field=1)

To speed things up, you can use terra (the replacement for raster)

library(raster)
f <- system.file("ex/lux.shp", package="terra")
v <- as.lines(vect(f))
r <- rast(v, ncol=75, nrow=100)
x <- rasterize(v, r, field=1)