Search code examples
rextractrastershapefile

Error when trying to extract values from a raster layer to each shapefile using R


I'm trying to extract the values from a raster layer to a shapefile layer, but the process is extremely time consuming, but 2 hours without me getting any result. In general considering the size of the polygons and the raster this process should not take more than two 2 minutes in cases that I had already done.

the message I get is this: no loop for break/next, jumping to top level

my code is this:

library(sf)
library(raster)

#shapefile geometry
shp <- sf::st_read("/vsicurl/http://wesleysc352.github.io/seg_s3_r3_m10_fix_estat_amost_val.shp")

#raster geotiff
raster3<-raster::raster("http://wesleysc352.github.io/1final.tif")

#extract values
ext1<-raster::extract(raster3, shp)

OBS Edit: links for reproducible example:

shape https://github.com/wesleysc352/wesleysc352.github.io/raw/master/seg_s3_r3_m10_fix_estat_amost_val.zip

raster https://github.com/wesleysc352/wesleysc352.github.io/raw/master/1final.tif


Solution

  • Use exact_extract from the exactextractr package. It takes 5 seconds:

    > system.time({ext1<-exact_extract(raster3, shp)})
      |======================================================================| 100%
       user  system elapsed 
      4.805   0.296   5.103 
    

    and gives you a list of data frames of pixel numbers and fractional coverage:

    > head(ext1[[1]])
      value coverage_fraction
    1     4              0.25
    2     4              0.50
    3     4              0.50
    4     4              0.25
    5     4              0.25
    6     4              0.75
    > head(ext1[[2]])
      value coverage_fraction
    1     4              0.25
    2     4              0.50
    3     4              0.25
    4     4              0.50
    5     4              1.00
    6     4              0.50
    [etc]
    

    which is more than extract gives you (a list of pixel values only).

    Why is it slow? Don't know, can't really see the point finding out when exactextractr exists. If you want to know, try subsets of your polygons. For example the first ten rows takes:

    > system.time({ext10<-extract(raster3, shp[1:10,])})
       user  system elapsed 
      2.549   0.029   2.574 
    

    2.57 seconds, which extrapolates to:

    > nrow(shp)
    [1] 2308
    > 2.5*2308/10
    [1] 577
    

    577 seconds for the lot. If the full set is still going after ten minutes I'd split it in parts and see if maybe there's an odd polygon with bad geometry that's messing it all up. But in that time you could have run exact_extract about 200 times and played a round of golf/gone fishing/engaged in pleasant leisure activity of your choice.