Search code examples
rapplypolygonterra

Efficiently create many polygons


I want to create polygons inside an apply and want to do this as quickly as possible from a matrix of coordinates. I have some code and realized this is one of the slowest parts of my code. How can I do this efficiently? I tried two different approaches:

Approach 1

library(sp)
library(terra)    
t0 <- Sys.time()

poly_list <- apply(matrix(1:10000), 1, function(idx){      
  # set coordinates
  coords <- cbind(rnorm(100), rnorm(100))      
  # create polygon
  Polygons(list(Polygon(coords)), idx)
})
  
# convert to terra polygons
poly_terra <- vect(SpatialPolygons(poly_list))

# show time passed
print(Sys.time() - t0)   
# Time difference of 2.082166 secs

Approach 2

t0 <- Sys.time()    
poly_list <- apply(matrix(1:10000), 1, function(idx){      
  # set coordinates
  coords <- cbind(rnorm(100), rnorm(100))      
  # create polygon
  vect(coords, type = "polygon")
})

# convert to terra polygons
poly_terra <- vect(poly_list)

print(Sys.time() - t0)
# Time difference of 16.38044 secs

Why is it faster to create sp polygons and convert them afterwards than directly creating terra polygons? The code with vect(SpatialPolygons(Polygons(list(Polygon(coords)), idx))) seems somewhat complicated. Is there a faster or at least more elegant way?

Edit Currently my fastest option, although it feels illegal:

t0 <- Sys.time()    
dummy <- Polygons(list(Polygon(cbind(rep(0,4), rep(0,4)))), "0")   
poly_list <- apply(matrix(1:10000), 1, function(idx){
  
  # set coordinates
  coords <- cbind(rnorm(100), rnorm(100))     
  # create polygon
  new <- dummy
  new@ID <- as.character(idx)
  new@Polygons[[1]]@coords <- coords
  return(new)
})

# convert to terra polygons
poly_terra <- vect(SpatialPolygons(poly_list))
print(Sys.time() - t0)
# Time difference of 0.7147191 secs

Solution

  • This is faster than your examples

    t0 <- Sys.time()    
    p <- lapply(1:10000, function(idx){
      cbind(id=idx, x=rnorm(100), y=rnorm(100))
    })
    p <- do.call(rbind, p)
    v <- vect(p, "polygons")
    print(Sys.time() - t0)
    #Time difference of 0.483578 secs
    

    This uses lapply and you state that you want to use apply; but in the context of your example apply does not seem to be a good choice.

    I do not see much performance difference between your two sp approaches. Below I use a streamlined version of the one you say is fastest and benchmark it with my approach:

    with_terra <- function() {
       p <- lapply(1:10000, function(idx){
          cbind(id=idx, x=rnorm(100), y=rnorm(100))
       })
       p <- do.call(rbind, p)
       vect(p, "polygons")
    }
    
    with_sp <- function() {
       dummy <- Polygons(list(Polygon(cbind(rep(0,4), rep(0,4)))), "0")   
       poly_list <- apply(matrix(1:10000), 1, function(idx){
         dummy@ID <- as.character(idx)
         dummy@Polygons[[1]]@coords <- cbind(rnorm(100), rnorm(100))
         dummy
       })
       vect(SpatialPolygons(poly_list))
    }
    
    bm <- microbenchmark::microbenchmark(
      sp = with_sp(),
      terra = with_terra(),
      times = 10
    )
    
    bm
    #Unit: milliseconds
    #  expr      min       lq     mean   median       uq       max neval
    #    sp 836.8434 892.8411 930.5261 935.3788 968.2724 1039.2840    10
    # terra 261.2191 276.0770 298.3603 282.7462 296.3674  437.0505    10