I am using the geojson file from the [OpenData Vancouver][1] website and I am trying to find the zoning classifications that fall within 5 kms of a "Historical Area".
So, I am buffering all historical areas by 5kms (My data is projected), performing the intersect operation and using the intersect results as an index:
buffs = gdf_26910[gdf_26910['zoning_classification']=='Historical Area']['geometry'].buffer(5000)
gdf_26910[buffs.intersects(gdf_26910['geometry'])]
However, this is the output I am getting:
zoning_category zoning_classification zoning_district object_id geometry area centroid buffer5K
87 HA Historical Area HA-1A 78541 POLYGON ((492805.516 5458679.305, 492805.038 5... 3848.384041 POINT (492778.785 5458699.947) POLYGON ((497803.807 5458548.605, 497803.124 5...
111 HA Historical Area HA-3 78640 POLYGON ((491358.402 5458065.050, 491309.735 5... 66336.339719 POINT (491183.139 5458103.162) POLYGON ((492818.045 5453267.595, 492421.697 5...
180 HA Historical Area HA-1A 78836 POLYGON ((492925.194 5458575.204, 492929.600 5... 90566.768532 POINT (492753.969 5458456.804) POLYGON ((487583.872 5458086.263, 487564.746 5...
683 HA Historical Area HA-1 78779 POLYGON ((492925.194 5458575.204, 492802.702 5... 69052.427940 POINT (492606.372 5458621.753) POLYGON ((487874.100 5456398.633, 487789.801 5...
1208 HA Historical Area HA-2 78833 POLYGON ((492332.139 5458699.308, 492346.989 5... 179805.027166 POINT (492343.437 5458944.412) POLYGON ((489822.136 5454379.453, 489755.087 5...
Clearly, I am getting a match for the Historical Areas and not all the other geometries that intersect the buffers.
I have plotted the buffers and the output looks correct:
#Plot
base=gdf_26910.plot()
buffs.plot(ax=base, color='red', alpha=0.25)
[![enter image description here][2]][2]
I have also opened the data in QGIS and verified that there are 5 'Historical Areas' and they are all adjacent to 'Comprehensive Development'. So, the matching rows after the intersect operation should be "Comprehensive Development" at the least.
Where am I going wrong?
Two core points
estimate_utm_crs()
for projection. Have also use cap_style
and join_style
for a more reflective buffered polygon.sjoin()
instead of mask approach in your code. This will effectively give duplicates, so de-dupe using pandas groupby().first()
predicate="within"
and used folium to visualise (possibly helps you understand how geometry is working)import geopandas as gpd
import folium
gdf_26910 = gpd.read_file(
"https://opendata.vancouver.ca/explore/dataset/zoning-districts-and-labels/download/?format=geojson&timezone=Europe/London&lang=en"
)
buffs = gdf_26910.loc[gdf_26910["zoning_classification"] == "Historical Area"]
# buffer is defined as km, so need a CRS in meters...
buffs = (
buffs.to_crs(buffs.estimate_utm_crs())
.buffer(5000, cap_style=2, join_style=3)
.to_crs(gdf_ha.crs)
)
# this warns so is clearly bad !
# gdf_26910[buffs.intersects(gdf_26910['geometry'])]
# some geometries intersect multiple historical areas, take first intersection from sjoin()
gdf_5km = (
gdf_26910.reset_index()
.sjoin(buffs.to_frame(), predicate="within")
.groupby("index")
.first()
.set_crs(gdf_26910.crs)
)
m = buffs.explore(name="buffer")
gdf_5km.explore("zoning_classification", m=m, name="within")
gdf_26910.explore("zoning_classification", m=m, name="all", legend=False)
folium.LayerControl().add_to(m)
m