I need to identify the shortest distance between points, across two dataframes.
Dataframe biz contains individual businesses, including their coordinates:
biz <- structure(list(name = c("bizA", "bizB", "bizC", "bizD"),
lon = c(-3.276435,-4.175388,-4.181740,-3.821941),
lat = c(11.96748,12.19885,13.04638,11.84277)),
class = "data.frame",row.names = c(NA, -4L))
biz
name lon lat
1 bizA -3.276435 11.96748
2 bizB -4.175388 12.19885
3 bizC -4.181740 13.04638
4 bizD -3.821941 11.84277
Dataframe city contains market cities, including their geocoordinates:
city <- structure(list(name = c("cityA", "cityB", "cityC", "cityD"),
lon = c(-4.7588042,-3.2432781,-3.0626284,-2.3566861),
lat = c(10.64002,10.95790,13.06950,13.20363)),
class = "data.frame",row.names = c(NA, -4L))
city
name lon lat
1 cityA -4.758804 10.64002
2 cityB -3.243278 10.95790
3 cityC -3.062628 13.06950
4 cityD -2.356686 13.20363
For each business in biz, I need to identify which market city is closest, and list the name of that market city in a new column:
biz
name lon lat city
1 bizA -3.276435 11.96748
2 bizB -4.175388 12.19885
3 bizC -4.181740 13.04638
4 bizD -3.821941 11.84277
I know that I can use packages like geosphere to measure the distance between bizA and cityA coordinates. I'm struggling with how to compare bizA to each city, minimize the distance, and then list that closest city in dataframe biz.
Any thoughts are much appreciated!
You can use st_nearest_feature from sf:
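library(sf)

# Convert both data frames to sf point objects (WGS84 longitude/latitude)
biz_sf <- st_as_sf(biz, coords = c("lon", "lat"), crs = 4326)
city_sf <- st_as_sf(city, coords = c("lon", "lat"), crs = 4326)

# st_nearest_feature() returns, for each business, the row index of the closest city
cbind(biz, nearest_city = city$name[st_nearest_feature(biz_sf, city_sf)])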
although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
name lon lat nearest_city
1 bizA -3.276435 11.96748 cityB
2 bizB -4.175388 12.19885 cityC
3 bizC -4.181740 13.04638 cityC
4 bizD -3.821941 11.84277 cityB
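If you would rather not depend on sf, or the planar-assumption message above is a concern, here is a minimal base-R sketch of the same idea: build a business-by-city distance matrix and take the minimum of each row. The helper haversine_km and the extra dist_km column are my own additions for illustration, not part of the question, and the Earth radius of 6371 km is an assumed mean value.

```r
# Haversine great-circle distance in kilometres (inputs in decimal degrees).
# radius_km = 6371 is an assumed mean Earth radius.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Same data as in the question
biz <- data.frame(
  name = c("bizA", "bizB", "bizC", "bizD"),
  lon = c(-3.276435, -4.175388, -4.181740, -3.821941),
  lat = c(11.96748, 12.19885, 13.04638, 11.84277)
)
city <- data.frame(
  name = c("cityA", "cityB", "cityC", "cityD"),
  lon = c(-4.7588042, -3.2432781, -3.0626284, -2.3566861),
  lat = c(10.64002, 10.95790, 13.06950, 13.20363)
)

# Distance matrix: one row per business, one column per city
dist_km <- outer(
  seq_len(nrow(biz)), seq_len(nrow(city)),
  function(i, j) haversine_km(biz$lon[i], biz$lat[i], city$lon[j], city$lat[j])
)

# Column index of the smallest distance in each row
nearest <- apply(dist_km, 1, which.min)

biz$nearest_city <- city$name[nearest]
biz$dist_km <- dist_km[cbind(seq_len(nrow(biz)), nearest)]
biz
```

For this data it picks the same cities as the sf result. The full matrix is fine for a few thousand points; since you already mention geosphere, its distm() function should give you a comparable matrix (in metres) that you can feed to the same which.min step.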