I have a dataframe that contains thousands of points with geolocation (longitude, latitude) for Washington D.C. The following is a snippet of it:
import pandas as pd
df = pd.DataFrame({'lat': [ 38.897221,38.888100,38.915390,38.895100,38.895100,38.901005,38.960491,38.996342,38.915310,38.936820], 'lng': [-77.031048,-76.898480,-77.021380,-77.036700,-77.036700 ,-76.990784,-76.862907,-77.028131,-77.010403,-77.184930]})
If you plot the points in the map you can see that some of them are clearly within some buildings:
import folium
wash_map = folium.Map(location=[38.8977, -77.0365], zoom_start=10)
for index,location_info in df.iterrows():
folium.CircleMarker(
location=[location_info["lat"], location_info["lng"]], radius=5,
fill=True, fill_color='red',).add_to(wash_map)
wash_map.save('example_stack.html')
import webbrowser
import os
webbrowser.open('file://'+os.path.realpath('example_stack.html'), new=2)
My goal is to exclude all the points that are within buildings. For that, I first download bounding boxes for the city buildings and then try to exclude points within those polygons as follows:
import osmnx as ox
#brew install spatialindex this solves problems in mac
%matplotlib inline
ox.config(log_console=True)
ox.__version__
tags = {"building": True}
gdf = ox.geometries.geometries_from_point([38.8977, -77.0365], tags, dist=1000)
gdf.shape
For computational simplicity I have requested the shapes of all buildings around the White house with a radius of 1 km. On my own code I have tried with bigger radiuses to make sure all the buildings are included.
In order to exclude points within the polygons I developed the following function (which includes the shape obtention):
def buildings(df,center_point,dist):
import osmnx as ox
#brew install spatialindex this solves problems in mac
%matplotlib inline
ox.config(log_console=True)
ox.__version__
tags = {"building": True}
gdf = ox.geometries.geometries_from_point(center_point, tags,dist)
from shapely.geometry import Point,Polygon
# Next step is to put our coordinates in the correct shapely format: remember to run the map funciton before
#df['within_building']=[]
for point in range(len(df)):
if gdf.geometry.contains(Point(df.lat[point],df.lng[point])).all()==False:
df['within_building']=False
else :
df['within_building']=True
buildings(df,[38.8977, -77.0365],1000)
df['within_building'].all()==False
The function always returns that points are outside building shapes although you can clearly see in the map that some of them are within. I don't know how to plot the shapes over my map so I am not sure if my polygons are correct but for the coordinates they appear to be so. Any ideas?
The example points you provided don't seem to fall within those buildings' footprints. I don't know what your points' coordinate reference system is, so I guessed EPSG4326. But to answer your question, here's how you would exclude them, resulting in gdf_points_not_in_bldgs
:
import geopandas as gpd
import matplotlib.pyplot as plt
import osmnx as ox
import pandas as pd
# the coordinates you provided
df = pd.DataFrame({'lat': [38.897221,38.888100,38.915390,38.895100,38.895100,38.901005,38.960491,38.996342,38.915310,38.936820],
'lng': [-77.031048,-76.898480,-77.021380,-77.036700,-77.036700 ,-76.990784,-76.862907,-77.028131,-77.010403,-77.184930]})
# create GeoDataFrame of point geometries
geom = gpd.points_from_xy(df['lng'], df['lat'])
gdf_points = gpd.GeoDataFrame(geometry=geom, crs='epsg:4326')
# get building footprints
tags = {"building": True}
gdf_bldgs = ox.geometries_from_point([38.8977, -77.0365], tags, dist=1000)
gdf_bldgs.shape
# get all points that are not within a building footprint
mask = gdf_points.within(gdf_bldgs.unary_union)
gdf_points_not_in_bldgs = gdf_points[~mask]
print(gdf_points_not_in_bldgs.shape) # (10, 1)
# plot buildings and points
ax = gdf_bldgs.plot()
ax = gdf_points.plot(ax=ax, c='r')
plt.show()
# zoom in to see better
ax = gdf_bldgs.plot()
ax = gdf_points.plot(ax=ax, c='r')
ax.set_xlim(-77.04, -77.03)
ax.set_ylim(38.89, 38.90)
plt.show()