Search code examples
rgeometryr-sf

R: How to replace values in polygon geometry without a for loop?


I have over 60,000 building polygons footprints in shapefile format (polygonZ) where I need to replace the z values (for each ring) for all the polygon vertices, so that all the Z values for the polygon are the same. The issue is that we have 3D polygons that have not been edited in 3D so some rings have correct Z values for all the vertices while other rings have a few erroneous Z values which need to be corrected.

I am able to do it via a for loop but it takes a good 45 minutes to complete as it needs to iterate through all the polygons.

I was wondering if anyone has any ideas on how to speed this up?

I did try the following, thinking I could replace all the values at once via indexing but it didn't work. I assume its due to there being a different number of vertices per polygon compared to the "new values" I want to use as replacement.

> myshape$geometry[][[1]][[1]][[1]][,2] <- new_values[]

Error in myshape$geometry[][[1]][[1]][[1]][, 2] <- new_values[] : 
  number of items to replace is not a multiple of replacement length

Here is a working example using stock R data. I am replacing the X instead of Z for this example.

library(sf)
library(tictoc) 

# load test data
myshape <- st_read(system.file("shape/nc.shp", package = "sf"))

# recast to polygon (my actual data are polygons, not multipolygons)
myshape <- st_cast(myshape, "POLYGON")

# make some random data
new_values <- runif(nrow(myshape))

# replace some values
tic()
for (row in 1:nrow(myshape)) {
  # replace
  myshape$geometry[row][[1]][[1]][,2] <- new_values[row]
}
toc()

edit: clarified that each ring will have the same Z value, recast to polygons (my actual data are polygons), provided background on my problem and also fixed an error in my for loop.


Solution

  • There's something going on that maybe someone else can explain--possibly the whole geometry column is being overwritten with every iteration using the for loop. A Map approach is orders of magnitude faster with a larger sf object:

    library(sf)
    
    # load test data
    myshape <- st_read(system.file("shape/nc.shp", package = "sf"))[sample(100, 1e4, 1),]
    
    # make some random data
    new_values <- runif(1e4)
    myshape$geometry2 <- myshape$geometry3 <- myshape$geometry
    
    system.time({
      for (row in 1:nrow(myshape)) {
        # replace
        myshape$geometry2[row][[1]][[1]][[1]][,2] <- new_values[row]
      }
    })
    #>    user  system elapsed 
    #>  105.19    0.39  105.64
    
    system.time({
      f <- function(g, x) {
        g[[1]][[1]][,2] <- x
        g
      }
      
      myshape$geometry3[] <- Map(f, myshape$geometry, new_values)
    })
    #>    user  system elapsed 
    #>    0.07    0.00    0.06
    
    identical(myshape$geometry2, myshape$geometry3)
    #> [1] TRUE