Search code examples
matplotlibpolygonarea

obtain hexagon areas from a matplotlib.pyplot.hexbin plot


I am trying to plot mass surface density using hexbin because my input is a dataframe of thousands of points with [lon, lat, mass]

hexbin is fast and I want to generate a surface density map (in this case kg of mass per m2 of hexagon) so I use the following call to hexbin:


# rounded number of hexagons in x to have a hexagon widths of aprox 100m
nx = round(111000 * lenght_x / 100)

hb = df.plot.hexbin(
        x="lon",
        y="lat",
        C="mass",
        gridsize=nx,
        cmap="viridis",
        mincnt=3,
        ax=ax,
        reduce_C_function=np.sum,
    )

I use reduce_C_fuinction to get the sum of the mass and now I want to divide by the hexagon area but I do not find a way to calculate the exact area of the irregular hexagons that can be generated.

I only be able to obtain the centers of the hexagons following this:

    # Get hexagon centers
    pollycollection = hb.get_children()[0]
    centers = pollycollection.get_offsets()
    x_c = [p[0] for p in centers]
    y_c = [p[1] for p in centers]
    plt.plot(x_c, y_c, "x-", color="red")

Does anyone know any way to obtain the exact area?

Thank you in advance!


Solution

  • IIUC, you could make a Polygon from the hexbin vertices, then normalize it using the area :

    import matplotlib.pyplot as plt
    
    fig, ax = plt.subplots(figsize=(6, 4))
    
    hb = ax.hexbin(
        x="lon",
        y="lat",
        C="mass",
        data=df,
        gridsize=10,
        cmap="viridis",
        mincnt=3,
        reduce_C_function=np.sum,
    )
    
    from shapely import Polygon
    import matplotlib.colors as mcolors
    
    hb_array_n = hb.get_array() / Polygon(hb.get_paths()[0].vertices).area
    norm = mcolors.Normalize(vmin=hb_array_n.min(), vmax=hb_array_n.max())
    
    hb.set_norm(norm)
    hb.set_array(hb_array_n)
    
    cb = plt.colorbar(hb, label="Surface Density")  # colorbar=True/pandas
    

    enter image description here

    Used input :

    import numpy as np
    import pandas as pd
    
    np.random.seed(0)
    
    N = 10_000
    
    df = pd.DataFrame(
        {
            "lon": np.random.uniform(-180, 180, N),
            "lat": np.random.uniform(-90, 90, N),
            "mass": np.random.exponential(100, N),
        }
    )