
Scaled logarithmic binning in python


I'm interested in plotting the probability distribution of a set of points that follow a power law. I would also like to use logarithmic binning to smooth out the large fluctuations in the tail. If I just use logarithmic binning and plot on a log-log scale, such as

pl.hist(MyList,log=True, bins=pl.logspace(0,3,50))
pl.xscale('log')

for example, then the problem is that the larger bins account for more points, i.e. the heights of my bins are not scaled by bin size.

Is there a way to use logarithmic binning and still have Python scale all the heights by the size of the bin? I know I can probably do this manually in some roundabout fashion, but it seems like a feature that should already exist, and I can't find it.


Solution

  • Matplotlib won't help you much if you have special requirements for your histograms. You can, however, easily create and manipulate a histogram with NumPy.

    import numpy as np
    from matplotlib import pyplot as plt
    
    # something random to plot
    data = (np.random.random(10000)*10)**3
    
    # log-scaled bins
    bins = np.logspace(0, 3, 50)
    widths = (bins[1:] - bins[:-1])
    
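    # Calculate the histogram; np.histogram returns (counts, bin_edges)
    counts, _ = np.histogram(data, bins=bins)
    # Normalize each count by its bin width
    hist_norm = counts / widths
    
    # Plot it! align='edge' puts each bar's left side on the bin's left edge
    plt.bar(bins[:-1], hist_norm, widths, align='edge')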
    plt.xscale('log')
    plt.yscale('log')
    

    When you present data in a non-obvious way like this, take care to label the y axis properly (it now shows counts per unit bin width, not raw counts) and to write an informative figure caption.
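
As a possible alternative to dividing by hand, both `np.histogram` and `plt.hist` accept `density=True`, which divides each count by its bin width and by the total number of counted points, so the bars integrate to 1. Below is a minimal sketch of that route, reusing the same kind of random data; the seed, the axis labels, and the variable names `density` and `manual` are illustrative assumptions, not part of the original answer.

```python
import numpy as np
from matplotlib import pyplot as plt

# Same kind of test data as above (seed chosen only for repeatability)
rng = np.random.default_rng(0)
data = (rng.random(10000) * 10) ** 3

# Log-spaced bin edges from 1 to 1000
bins = np.logspace(0, 3, 50)
widths = np.diff(bins)

# density=True divides each count by (points inside the bin range * bin width)
density, edges = np.histogram(data, bins=bins, density=True)

# The same quantity computed by hand, for comparison
counts, _ = np.histogram(data, bins=bins)
manual = counts / (counts.sum() * widths)

# plt.hist applies the same normalization when given density=True
fig, ax = plt.subplots()
ax.hist(data, bins=bins, density=True)
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_xlabel('value')
ax.set_ylabel('probability density')
# Call plt.show() to display the figure
```

Note that `density=True` normalizes over the points that fall inside the bin range only, so values below 1 in this data are ignored. The result differs from the counts-per-width version by a constant factor, which on a log-log plot is just a vertical shift.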