python matplotlib visualization geopandas geoplot

Issues Using geoplot.kdeplot with geopandas


Example of a kdeplot from the geopandas docs

I'm trying to make a kdeplot using geopandas.

This is my code:

Downloading the shapefile

import zipfile

import requests

URL = "https://data.sfgov.org/api/geospatial/wkhw-cjsf?method=export&format=Shapefile"
response = requests.get(URL)
# write the downloaded archive to disk, closing the file afterwards
with open('pd_data.zip', 'wb') as f:
    f.write(response.content)

with zipfile.ZipFile('./pd_data.zip', 'r') as zip_ref:
    zip_ref.extractall('./ShapeFiles')

Making the geopandas data frame

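import pathlib

import geopandas as gdp
import pandas as pd

# `train` is the training DataFrame (not shown); count incidents per district
data = train.groupby(['PdDistrict']).count().iloc[:,0]
data = pd.DataFrame({ "district": data.index,
                    "incidences": data.values})

# path of the first .shp file found in the extracted archive
california_map = str(list(pathlib.Path('./ShapeFiles').glob('*.shp'))[0])

gdf = gdp.read_file(california_map)
gdf = pd.merge(gdf, data, on = 'district')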

Note: I didn't include the link to the training set because it isn't important for this question (use any data you want).

This is the part that I don't get: what arguments should I pass to the kdeplot function? Where do I pass the shapefile, and where do I pass the data?

ax = gplt.kdeplot(
    data, clip=gdf.geometry,
    shade=True, cmap='Reds',
    projection=gplt.crs.AlbersEqualArea())
gplt.polyplot(boroughs, ax=ax, zorder=1)

Solution

    • I had a few challenges setting up an environment that did not crash the kernel. I ended up using non-wheel versions of shapely and pygeos.
    • A few things are covered in the kdeplot documentation: a basic kdeplot takes pointwise data as input. You did not provide a sample of data, so I'm not sure that it is pointwise. I have simulated pointwise data: 100 points within each of the districts in the referenced geometry.
    • I have found that I cannot use the clip and projection parameters together. Use one or the other, not both.
    • The shapefile geometry is passed to clip.
    import geopandas as gpd
    import pandas as pd
    import numpy as np
    import geoplot as gplt
    import geoplot.crs as gcrs
    
    # setup starting point to match question
    url = "https://data.sfgov.org/api/geospatial/wkhw-cjsf?method=export&format=Shapefile"
    gdf = gpd.read_file(url)
    
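    # generate 100 points in each of the districts:
    # draw N candidate points in each district's bounding box, keep the ones
    # that fall inside the district, then sample 100 per district further down
    r = np.random.RandomState(42)
    N = 5000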
    data = pd.concat(
        [
            gpd.GeoSeries(
                gpd.points_from_xy(*[r.uniform(*g.bounds[s::2], N) for s in (0, 1)]),
                crs=gdf.crs,
            ).loc[lambda s: s.intersects(g.buffer(-0.003))]
            for _, g in gdf["geometry"].items()
        ]
    )
    data = (
        gpd.GeoDataFrame(geometry=data)
        .sjoin(gdf)
        .groupby("district")
        .sample(100, random_state=42)
        .reset_index(drop=True)
    )
    
    ax = gplt.kdeplot(
        data,
        clip=gdf,
        fill=True,
        cmap="Reds",
        # projection=gplt.crs.AlbersEqualArea(),  # left out: did not work together with clip
    )
    gplt.polyplot(gdf, ax=ax, zorder=1)
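
If the real data has one row per incident with coordinates, rather than one count per district, it could be turned into pointwise data along these lines. This is only a sketch: the `incidents` table, its `lon` and `lat` column names, and the EPSG:4326 CRS are assumptions for illustration, not taken from the question.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical incident table: one row per incident with longitude/latitude.
# The column names "lon" and "lat" are assumptions, not from the question.
incidents = pd.DataFrame(
    {
        "district": ["MISSION", "MISSION", "RICHMOND"],
        "lon": [-122.419, -122.415, -122.478],
        "lat": [37.760, 37.763, 37.778],
    }
)

# Build one Point geometry per incident. EPSG:4326 (lon/lat in degrees) is
# assumed here; it should match the CRS of the district geometry.
points = gpd.GeoDataFrame(
    incidents,
    geometry=gpd.points_from_xy(incidents["lon"], incidents["lat"]),
    crs="EPSG:4326",
)

# A kdeplot estimates density from point locations, so confirm that every
# geometry really is a Point before plotting.
all_points = bool((points.geom_type == "Point").all())
```

A frame like `points` would then take the place of `data` as the first argument to `gplt.kdeplot`, with the district geometry still passed to `clip`.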
    
