I'm very new to GeoPandas. I'm loading a shapefile of the US states into a GeoDataFrame; each state is a Polygon or MultiPolygon. I'd like to calculate the minimum distance (in km or miles) between each pair of states, i.e. the closest distance between the CA polygon and the TX polygon, for example (with 0 or a very small number for bordering states).
Sample code for loading the GeoDataFrame from the shapefile (the shapefile's coordinate system is GCS_North_American_1983):
import geopandas as gpd
from shapely.geometry import Point
shapefile = 'states.shp'
shape_gdf = gpd.read_file(shapefile)
The following example shows how you can calculate the distances, using the New York City boroughs as a sample dataset:
import geopandas as gpd
import pandas as pd

# path = "states.shp"
# Note: gpd.datasets was deprecated in GeoPandas 0.14 and removed in 1.0;
# on newer versions, point `path` at your own file instead.
path = gpd.datasets.get_path("nybb")
gdf = gpd.read_file(path)

# GeoPandas needs a projected CRS to get sensible distances.
# "USA Contiguous Equidistant Conic" is a reasonable option for US states;
# its units are meters, so the resulting distances are in meters.
# gdf = gdf.to_crs("ESRI:102005")

result_gdf = None
for row in gdf.itertuples():
    # Only compare against rows with a smaller index, so each pair is computed once.
    other_gdf = gdf[gdf.index < row.Index]
    distances = other_gdf.geometry.distance(row.geometry)
    distances_gdf = other_gdf.copy()
    distances_gdf["distance"] = distances
    # Attach the attributes of the current row as "to_*" columns.
    for name, value in row._asdict().items():
        distances_gdf[f"to_{name}"] = value
    if result_gdf is None:
        result_gdf = distances_gdf
    else:
        result_gdf = pd.concat([result_gdf, distances_gdf])

print(result_gdf[["BoroName", "to_BoroName", "distance"]].to_string(index=False))
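Result (distances are in the units of the dataset's CRS; the nybb sample uses EPSG:2263, so these values are in US survey feet):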
BoroName       to_BoroName      distance
Staten Island  Queens       30636.103288
Staten Island  Brooklyn      4895.344637
Queens         Brooklyn         0.000000
Staten Island  Manhattan    17455.990534
Queens         Manhattan        0.000000
Brooklyn       Manhattan        0.000000
Staten Island  Bronx        69951.707811
Queens         Bronx            0.000000
Brooklyn       Bronx        23611.105964
Manhattan      Bronx            0.000000
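
Since the question asks for km or miles, here is a minimal sketch of the same idea with an explicit reprojection and unit conversion. It is only an illustration under assumptions: the three boxes named "A", "B" and "C" are made-up stand-ins for state polygons, EPSG:4269 is assumed to match the shapefile's GCS_North_American_1983, and the result is laid out as a square matrix rather than the long table above.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Hypothetical stand-ins for state polygons, in lon/lat degrees (NAD83).
# "A" and "B" share an edge; "C" lies four degrees of longitude east of "A".
toy = gpd.GeoDataFrame(
    {"name": ["A", "B", "C"]},
    geometry=[
        box(-100, 35, -99, 36),
        box(-99, 35, -98, 36),
        box(-95, 35, -94, 36),
    ],
    crs="EPSG:4269",
)

# Reproject to a CRS with meter units before measuring anything.
projected = toy.set_index("name").geometry.to_crs("ESRI:102005")

# One column per geometry: distance from every geometry to that one, in km.
matrix_km = pd.DataFrame(
    {name: projected.distance(geom) / 1000 for name, geom in projected.items()}
)

# 1 mile is exactly 1.609344 km.
matrix_miles = matrix_km / 1.609344

print(matrix_km.round(1))
```

With a real states file, the toy frame would be replaced by the result of gpd.read_file, and the index by whatever column holds the state names. Distances measured in a projected CRS are approximations of the true ground distance, so small differences from geodesic values are to be expected.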