Search code examples
rmapsbufferr-sfr-leaflet

Calculate average distance between all points in radius and reference point


I have spatial data and would like to create a new column in my dataframe that calculates the average distance to neighboring coordinates within a radius of my choice.

I found this code that gets me kind of close:

library(sf)
library(dplyr)
library(leaflet)

# three semi random cities in NC
cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
                     x = c(-78.633333, -79.819444, -77.912222),
                     y = c(35.766667, 36.08, 34.223333)) %>% 
  sf::st_as_sf(coords = c("x", "y"), crs = 4326)

#Buffer of 2 arc degrees around Greensboro
buffer <- sf::st_buffer(cities[2, ], 200000)

# Draw Leaflet map
leaflet() %>% 
  addProviderTiles("OpenStreetMap") %>% 
  addPolygons(data = buffer) %>% 
  addCircleMarkers(data = cities, 
                   color = "red",
                   fillOpacity = 2/3,
                   stroke = F)

# Count # of coordinates in radius
count <- sf::st_join(cities, buffer, left = F) %>% 
  nrow()

I want to be able to modify this code with the adjustable radius and say, when I used X as the radius, the average distance between Greensboro and whatever other points are captured in the circle is = 15.4 miles.

Can someone help point me in the right direction of how to solve this problem?


Solution

  • This can be most easily done using the sfdep package.

    The idea is to create a neighbor matrix using a distance band. Then, with those neighbors, find the distances from the focal point. Lastly, iterate over the resultant list column to calculate the average.

    library(sf)
    library(sfdep)
    library(dplyr)
    library(leaflet)
    
    # three semi random cities in NC
    cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
                         x = c(-78.633333, -79.819444, -77.912222),
                         y = c(35.766667, 36.08, 34.223333)) %>% 
      as_tibble() |> 
      sf::st_as_sf(coords = c("x", "y"), crs = 4326)
    
    
    cities |> 
      mutate(
        nb = st_dist_band(geometry),
        dists = st_nb_dists(geometry, nb),
        avg_dist = purrr::map_dbl(dists, mean)
      ) 
    #> Simple feature collection with 3 features and 4 fields
    #> Geometry type: POINT
    #> Dimension:     XY
    #> Bounding box:  xmin: -79.81944 ymin: 34.22333 xmax: -77.91222 ymax: 36.08
    #> Geodetic CRS:  WGS 84
    #> # A tibble: 3 × 5
    #>   name                   geometry nb        dists     avg_dist
    #> * <chr>               <POINT [°]> <nb>      <list>       <dbl>
    #> 1 Raleigh    (-78.63333 35.76667) <int [2]> <dbl [2]>     148.
    #> 2 Greensboro    (-79.81944 36.08) <int [1]> <dbl [1]>     112.
    #> 3 Wilmington (-77.91222 34.22333) <int [1]> <dbl [1]>     184.