I want to do a spatial binning (using median as aggregation function)
starting from a CSV file containing pollutant values measured at positions long and lat.
The resulting map should be something as:
But for data applied to a city's extent.
At this regard I found this tutorial that is close to what I want to do, but I was not able to get the desired result.
I think that I'm missing something on how to correctly use dissolve
and plot the resulting data (better using Folium
)
Any useful example code?
shapely.geometry.box()
aggfunc
to demonstrate multiple metrics can be calculatedimport geopandas as gpd
import shapely.geometry
import numpy as np
# equivalent of CSV, all earthquake points globally
gdf_e = gpd.read_file(
"https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.geojson"
)
# get geometry of bounding area. Have selected a state rather than a city
gdf_CA = gpd.read_file(
"https://raw.githubusercontent.com/glynnbird/usstatesgeojson/master/california.geojson"
).loc[:, ["geometry"]]
BOXES = 50
a, b, c, d = gdf_CA.total_bounds
# create a grid for Califormia, could be a city
gdf_grid = gpd.GeoDataFrame(
geometry=[
shapely.geometry.box(minx, miny, maxx, maxy)
for minx, maxx in zip(np.linspace(a, c, BOXES), np.linspace(a, c, BOXES)[1:])
for miny, maxy in zip(np.linspace(b, d, BOXES), np.linspace(b, d, BOXES)[1:])
],
crs="epsg:4326",
)
# remove grid boxes created outside actual geometry
gdf_grid = gdf_grid.sjoin(gdf_CA).drop(columns="index_right")
# get earthquakes that have occured within one of the grid geometries
gdf_e_CA = gdf_e.loc[:, ["geometry", "mag"]].sjoin(gdf_grid)
# get median magnitude of eargquakes in grid
gdf_grid = gdf_grid.join(
gdf_e_CA.dissolve(by="index_right", aggfunc="median").drop(columns="geometry")
)
# how many earthquakes in the grid
gdf_grid = gdf_grid.join(
gdf_e_CA.dissolve(by="index_right", aggfunc=lambda d: len(d))
.drop(columns="geometry")
.rename(columns={"mag": "number"})
)
# drop grids geometries that have no measures and create folium map
m = gdf_grid.dropna().explore(column="mag")
# for good measure - boundary on map too
gdf_CA["geometry"].apply(lambda g: shapely.geometry.MultiLineString([p.exterior for p in g.geoms])).explore(m=m)