Search code examples
rgisgeospatialr-sfr-sp

How to do spatial join of polygons to nearest point using polygon boundry?


I have polygons (polygon.shp). I want to match the nearest points to the polygons. The link to the file LINK

I am trying to achieve the following task:

  1. Calculate the distance to the nearest point from the boundary of the polygon (NOT from the centroid of polygons)
  2. For some polygons I have to match more than one nearest point (see "no_matches" attribute)

In the final output I am planning to achieve the following information: Column: id, no_matches, point_id_1, distance_1, point_id_2, distance_2, point_id_3, distance_3

setwd("~/example")
library(sf)
polygon <- st_read("polygon.shp")
point <- st_read("point.shp")

nearest = try(sf::st_nearest_feature(polygon, point))
ls = sf::st_nearest_points( polygon, point[nearest], pairwise = F)

I get the following error:

Error in `[.data.frame`(x, i) : undefined columns selected

Any idea how could I achieve this task?


Solution

  • Consider this piece of code; what it does is:

    • iterates over your polygons object, creating 1 row in results dataframe for each iteration
    • for a given polygon finds indices of nearest points; this will be, depending on the value of no_matches column, vector of length one to three
    • for each iteration find distance from your polygon to three points in the nearest vector; should the nearest vector have length less than 3 (as specified by no_matches column) NA will be returned.

    Given the need to find more than 1 nearest object (which I kind of missed in my previous answer, now deleted) you will be better off with nngeo::st_nn() than sf::st_nearest_feature(). Using sf::st_distance() for calculating distance remains.

    It is necessary to iterate (via a for cycle or some kind of apply) because the k argument of nngeo::st_nn() is not vectorized.

    library(sf)
    library(dplyr)
    
    points <- st_read("./example/point.shp")
    polygons <- st_read("./example/polygon.shp")
    
    result <- data.frame() # initiate empty result set
    
    for (i  in polygons$id) { # iterate over polygons
      
      # ids of nearest points as vector of lenght one to three
      nearest <- nngeo::st_nn(polygons[polygons$id == i, ], 
                              points, 
                              k = polygons[polygons$id == i, ]$no_matches,
                              progress = F)[[1]]
      
      # calculate result for current polygon
      current <- data.frame(id = i,
                            # id + distance to first nearest point
                            point_id_1 = points[nearest[1], ]$point_id,
                            distance_1 = st_distance(polygons[polygons$id == i, ],
                                                     points[nearest[1], ]),
                            # id + distance to second nearest point (or NA)
                            point_id_2 = points[nearest[2], ]$point_id,
                            distance_2 = st_distance(polygons[polygons$id == i, ],
                                                     points[nearest[2], ]),
                            # id + distance to third nearest point (or NA)
                            point_id_3 = points[nearest[3], ]$point_id,
                            distance_3 = st_distance(polygons[polygons$id == i, ],
                                                     points[nearest[3], ])
      )
      
      
      # add current result to "global" results data frame
      result <- result %>% 
        bind_rows(current)
      
    }
    
    # check results
    result