I want to create polygons inside an apply
and want to do this as quickly as possible from a matrix of coordinates. I have some code and realized this is one of the slowest parts of my code. How can I do this efficiently? I tried two different approaches:
Approach 1
library(sp)
library(terra)
t0 <- Sys.time()
poly_list <- apply(matrix(1:10000), 1, function(idx){
# set coordinates
coords <- cbind(rnorm(100), rnorm(100))
# create polygon
Polygons(list(Polygon(coords)), idx)
})
# convert to terra polygons
poly_terra <- vect(SpatialPolygons(poly_list))
# show time passed
print(Sys.time() - t0)
# Time difference of 2.082166 secs
Approach 2
t0 <- Sys.time()
poly_list <- apply(matrix(1:10000), 1, function(idx){
# set coordinates
coords <- cbind(rnorm(100), rnorm(100))
# create polygon
vect(coords, type = "polygon")
})
# convert to terra polygons
poly_terra <- vect(poly_list)
print(Sys.time() - t0)
# Time difference of 16.38044 secs
Why is it faster to create sp
polygons and convert them afterwards than directly creating terra
polygons? The code with vect(SpatialPolygons(Polygons(list(Polygon(coords)), idx)))
seems somewhat complicated. Is there a faster or at least more elegant way?
Edit Currently my fastest option, although it feels illegal:
t0 <- Sys.time()
dummy <- Polygons(list(Polygon(cbind(rep(0,4), rep(0,4)))), "0")
poly_list <- apply(matrix(1:10000), 1, function(idx){
# set coordinates
coords <- cbind(rnorm(100), rnorm(100))
# create polygon
new <- dummy
new@ID <- as.character(idx)
new@Polygons[[1]]@coords <- coords
return(new)
})
# convert to terra polygons
poly_terra <- vect(SpatialPolygons(poly_list))
print(Sys.time() - t0)
# Time difference of 0.7147191 secs
This is faster than your examples
t0 <- Sys.time()
p <- lapply(1:10000, function(idx){
cbind(id=idx, x=rnorm(100), y=rnorm(100))
})
p <- do.call(rbind, p)
v <- vect(p, "polygons")
print(Sys.time() - t0)
#Time difference of 0.483578 secs
This uses lapply
and you state that you want to use apply
; but in the context of your example apply
does not seem to be a good choice.
I do not see much performance difference between your two sp approaches. Below I use a streamlined version of the one you say is fastest and benchmark it with my approach:
with_terra <- function() {
p <- lapply(1:10000, function(idx){
cbind(id=idx, x=rnorm(100), y=rnorm(100))
})
p <- do.call(rbind, p)
vect(p, "polygons")
}
with_sp <- function() {
dummy <- Polygons(list(Polygon(cbind(rep(0,4), rep(0,4)))), "0")
poly_list <- apply(matrix(1:10000), 1, function(idx){
dummy@ID <- as.character(idx)
dummy@Polygons[[1]]@coords <- cbind(rnorm(100), rnorm(100))
dummy
})
vect(SpatialPolygons(poly_list))
}
bm <- microbenchmark::microbenchmark(
sp = with_sp(),
terra = with_terra(),
times = 10
)
bm
#Unit: milliseconds
# expr min lq mean median uq max neval
# sp 836.8434 892.8411 930.5261 935.3788 968.2724 1039.2840 10
# terra 261.2191 276.0770 298.3603 282.7462 296.3674 437.0505 10