Tags: r, matching, spatial, spatial-index

Spatial matching of big datasets


I have a dataset with about 100,000 points and another dataset with roughly 3,000 polygons. For each point I need to find the nearest polygon (a spatial match). Points inside a polygon should be matched to that polygon.

Computing all-pairs distances is feasible, but takes a bit longer than necessary. Is there an R package that will make use of a spatial index for this kind of matching problem?

I am aware of the sp package and its over function, but the documentation doesn't say anything about indexes.


Solution

  • You could try the gDistance function in the rgeos package for this. See the example below, which I reworked from this old thread. Hope it helps.

    library( rgeos )
    library( sp )
    
    # Make some polygons
    grd <- GridTopology(c(1,1), c(1,1), c(10,10))
    polys <- as.SpatialPolygons.GridTopology(grd)
    
    # Make some points and label with letter ID
    set.seed( 1091 )
    pts <- matrix( runif( 20 , 1 , 10 ) , ncol = 2 )
    sp_pts <- SpatialPoints( pts )
    row.names(pts) <- letters[1:10]
    
    # Plot
    plot( polys )
    text( pts , labels = row.names( pts ) , col = 2 , cex = 2 )
    text( coordinates(polys) , labels = row.names( polys ) , col = "#313131" , cex = 0.75 )
    

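    # Find the polygon nearest to each point.
    # With byid = TRUE, gDistance returns a distance matrix with one row per
    # polygon and one column per point, so which.min over each column gives
    # the row index of the nearest polygon.
    cbind( row.names( pts ) , apply( gDistance( sp_pts , polys , byid = TRUE ) , 2 , which.min ) )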
    #   [,1] [,2]
    #1  "a"  "86"
    #2  "b"  "54"
    #3  "c"  "12"
    #4  "d"  "13"
    #5  "e"  "78"
    #6  "f"  "25"
    #7  "g"  "36"
    #8  "h"  "62"
    #9  "i"  "40"
    #10 "j"  "55"
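
Note that the gDistance approach above still builds the full polygon-by-point distance matrix, so it does not by itself answer the question about a spatial index. As a possible alternative, here is a minimal sketch using the sf package, which is not mentioned in the original thread and is an assumption on my part. As far as I know, st_nearest_feature looks up the nearest geometry through a GEOS spatial index (an STRtree) rather than computing all pairs, and rgeos itself has since been retired from CRAN. The grid and the object names (bbox, polys, pts, nearest) are made up for illustration. st_make_grid numbers its cells differently from as.SpatialPolygons.GridTopology, so the polygon indices will not match the output above.

```r
library(sf)

# Polygons: a 10 x 10 grid of unit squares covering [0.5, 10.5] on both axes
bbox  <- st_bbox(c(xmin = 0.5, ymin = 0.5, xmax = 10.5, ymax = 10.5))
polys <- st_make_grid(st_as_sfc(bbox), n = c(10, 10))

# Points: same random coordinates as in the example above, labelled a..j
set.seed(1091)
pts_mat <- matrix(runif(20, 1, 10), ncol = 2)
pts <- st_as_sf(
  data.frame(id = letters[1:10], x = pts_mat[, 1], y = pts_mat[, 2]),
  coords = c("x", "y")
)

# For each point, the index (into polys) of the nearest polygon.
# A point lying inside a polygon has distance 0 to it, so it matches that polygon.
nearest <- st_nearest_feature(pts, polys)

result <- data.frame(point = pts$id, polygon = nearest)
print(result)
```

For real data, both layers should be in the same projected coordinate reference system before matching, so that "nearest" is measured in sensible planar units.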