Tags: python, matplotlib, seaborn, histogram

Plot a histogram such that bar heights sum to 1 (probability)


I'd like to plot a normalized histogram from a vector using matplotlib. I tried the following:

plt.hist(myarray, normed=True)

as well as:

plt.hist(myarray, normed=1)

but neither option produces a y-axis from [0, 1] such that the bar heights of the histogram sum to 1.


Solution

  • It would be more helpful if you posted a more complete working (or in this case non-working) example.

    I tried the following:

    import numpy as np
    import matplotlib.pyplot as plt
    
    x = np.random.randn(1000)
    
    fig = plt.figure()
    ax = fig.add_subplot(111)
    n, bins, rectangles = ax.hist(x, 50, density=True)
    fig.canvas.draw()
    plt.show()
    

    For this data, this does produce a bar-chart histogram whose y-axis values fall within [0, 1].

    Further, as per the hist documentation (i.e. ax.hist? from ipython), I think the sum is fine too. Note that older matplotlib versions call this parameter normed; it is density in the code above:

    *normed*:
    If *True*, the first element of the return tuple will
    be the counts normalized to form a probability density, i.e.,
    ``n/(len(x)*dbin)``.  In a probability density, the integral of
    the histogram should be 1; you can verify that with a
    trapezoidal integration of the probability density function::
    
        pdf, bins, patches = ax.hist(...)
        print np.sum(pdf * np.diff(bins))
    

    Giving this a try after the commands above:

    np.sum(n * np.diff(bins))
    

    I get a return value of 1.0 as expected. Remember that normed=True doesn't mean that the sum of the bar heights will be unity, but rather that the integral over the bars is unity. In my case np.sum(n) returned approx 7.2767.
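
If the goal really is bar heights that sum to 1 (the fraction of samples per bin, not a density), one common approach is to pass per-sample weights of 1/N to hist. The following is a minimal sketch, not part of the original answer: the seeded random data, the variable names, and the headless "Agg" backend are assumptions made so the example runs anywhere.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import matplotlib.pyplot as plt

# Illustrative data (assumption): 1000 draws from a standard normal.
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)

# Weight each sample by 1/N so each bar shows the fraction of samples in its bin.
weights = np.ones_like(x) / len(x)

fig, ax = plt.subplots()
heights, bins, patches = ax.hist(x, bins=50, weights=weights)
ax.set_ylabel("Fraction of samples")

# Unlike density=True, the bar heights themselves now add up to 1
# (up to floating-point rounding).
total = heights.sum()
print(total)
```

With weights, the heights do not depend on the bin width, so the integral over the bars is generally no longer 1. To view the figure, drop the matplotlib.use("Agg") line and call plt.show(), or save it with fig.savefig.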