Compute mean value per pixel using weighted 2d histogram


I am using pyplot.hist2d to plot a 2D histogram (x vs. y) weighted by a third variable, z. Instead of summing the z values in a given pixel [x_i, y_i], as hist2d does, I'd like to obtain the average z of all data points falling in that pixel.

Is there a Python script that does that?

Thanks.


Solution

  • NumPy's histogram2d() can calculate both the counts (a standard histogram) and the sums (via the weights parameter). Dividing the sums by the counts gives the mean value in each bin.

    The example below shows the three histograms (counts, sums, and mean values), each with a colorbar. The number of samples is kept relatively small to demonstrate what happens for cells with a count of zero: the division 0/0 gives NaN, so the cell is left blank.

    import numpy as np
    import matplotlib.pyplot as plt
    
    N = 1000
    x = np.random.uniform(0, 10, N)
    y = np.random.uniform(0, 10, N)
    z = np.cos(x) * np.sin(y)
    
    # Counts per bin (plain histogram) and per-bin sums of z (weighted histogram)
    counts, xbins, ybins = np.histogram2d(x, y, bins=(30, 20))
    sums, _, _ = np.histogram2d(x, y, weights=z, bins=(xbins, ybins))
    with np.errstate(divide='ignore', invalid='ignore'):  # suppress possible divide-by-zero warnings
        means = sums / counts  # NaN where a bin is empty (0 / 0)
    
    # histogram2d returns arrays indexed [x, y], while pcolormesh expects [row=y, col=x],
    # so the arrays are transposed to put x on the horizontal axis
    fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=(15, 4))
    m1 = ax1.pcolormesh(xbins, ybins, counts.T, cmap='coolwarm')
    plt.colorbar(m1, ax=ax1)
    ax1.set_title('counts')
    m2 = ax2.pcolormesh(xbins, ybins, sums.T, cmap='coolwarm')
    plt.colorbar(m2, ax=ax2)
    ax2.set_title('sums')
    m3 = ax3.pcolormesh(xbins, ybins, means.T, cmap='coolwarm')
    plt.colorbar(m3, ax=ax3)
    ax3.set_title('mean values')
    plt.tight_layout()
    plt.show()
    

    [Figure: example plot with three panels titled "counts", "sums", and "mean values"]
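
To check the sums-divided-by-counts idea by hand, here is a minimal sketch without any plotting. The helper name `mean_per_bin` and the tiny data set are made up for illustration and are not part of NumPy or Matplotlib. Instead of silencing the warning with np.errstate, it uses the `where` argument of np.divide so that empty bins are simply left as NaN.

```python
import numpy as np


def mean_per_bin(x, y, z, bins):
    """Hypothetical helper: mean of z per 2D bin, NaN where a bin is empty."""
    counts, xedges, yedges = np.histogram2d(x, y, bins=bins)
    sums, _, _ = np.histogram2d(x, y, weights=z, bins=(xedges, yedges))
    means = np.full_like(sums, np.nan)
    # Divide only where at least one sample fell into the bin;
    # all other entries keep their initial NaN
    np.divide(sums, counts, out=means, where=counts > 0)
    return means, xedges, yedges


# Tiny data set on a 2 x 2 grid over [0, 2] x [0, 2]:
# two points in the lower-left bin (z = 2 and 4), one in the upper-right (z = 10)
x = np.array([0.5, 0.5, 1.5])
y = np.array([0.5, 0.5, 1.5])
z = np.array([2.0, 4.0, 10.0])
edges = np.array([0.0, 1.0, 2.0])

means, xedges, yedges = mean_per_bin(x, y, z, bins=(edges, edges))
print(means)
```

With this input, means[0, 0] is 3.0 (the average of 2 and 4), means[1, 1] is 10.0, and the two empty bins are NaN. The returned array is indexed [x, y] like the output of histogram2d, so it would need to be transposed before being passed to pcolormesh with x on the horizontal axis.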