Search code examples
pythonsmoothinghistogram2d

How to smooth or overlap bins in pyplot.hist2d?


I am plotting a 2D histogram to show, for example, the concentration of lightnings (given by their position registered in longitude and latitude). The number of data points is not too large (53) and the result is too coarse. Here is a picture of the result:

For this reason, I am trying to find a way to weight in data from surrounding bins. For example, there is a bin at longitude = 130 and latitude = 34.395 with 0 lightning registered, but with several around it. I would want this bin to reflect somehow the concentration around it. In other words, I want to smooth the data by having overlapping bins (so that a data point can be counted more than once, by different contiguous bins).

I understand that hist2d has the input option for "weights", but this would only work to make a data point more "important" within its bin.

The simplified code is below and I can clarify anything needed.

import numpy as np
import matplotlib.pyplot as plt

# Here are the data, to experiment if needed
longitude = np.array([119.165, 115.828, 110.354, 117.124, 119.16 , 107.068, 108.628, 126.914, 125.685, 116.608, 122.455, 116.278, 123.43, 128.84, 128.603, 130.192, 124.508, 121.916, 133.245, 125.088, 126.641, 127.224, 113.686, 129.376, 127.312, 121.353, 117.834, 125.219, 138.077, 153.299, 135.66 , 128.391, 118.011, 117.313, 119.986, 118.619, 119.178, 120.295, 121.991, 123.519, 135.948, 132.224, 129.317, 135.334, 132.923, 129.828, 139.006, 140.813, 116.207, 139.254, 120.922, 112.171, 143.508])

latitude = np.array([34.381, 34.351, 34.359, 34.357, 34.364, 34.339, 34.351, 34.38, 34.381, 34.366, 34.373, 34.366, 34.369, 34.387, 34.39 , 34.39 , 34.386, 34.371, 34.394, 34.386, 34.384, 34.387, 34.369, 34.4  , 34.396, 34.37 , 34.374, 34.383, 34.403, 34.429, 34.405, 34.385, 34.367, 34.36 , 34.367, 34.364, 34.363, 34.367, 34.367, 34.369, 34.399, 34.396, 34.382, 34.401, 34.396, 34.392, 34.401, 34.401, 34.362, 34.404, 34.382, 34.346, 34.406])

# Number of bins 
Nbins = 15
# Plot histogram of the positions
plt.hist2d(longitude,latitude, bins=Nbins)
plt.plot(longitude,latitude,'o',markersize = 8, color = 'k')
plt.plot(longitude,latitude,'o',markersize = 6, color = 'w')
plt.colorbar()
plt.show()

Solution

  • Perhaps you're getting confused with the concept of 2D-histogram, or histogram. Besides the fact a histogram is a bar plot groupping data into plot, it is also a dicretized estimation of a probability funtion. In your case, the presence probability. For this reason, I would not try to overlap histograms.

    Moreover, because the histogram is 'discrete', it will be necessarily coarse. Actually, the resolution of a histogram is an important parameter regarding the desired visualization.

    Going back to your question, if you want to disminish the coarse effect, you may to simply want to play on Nbins.

    Perhaps, other graph type would suit better your usage: see this gallery and the 2D-density plot with shading.