
Custom Histogram Normalization in matplotlib


I am trying to make a normalized histogram in matplotlib, but I want it normalized so that the total area is 1000. Is there a way to do this?

I know that to normalize it to 1, you just have to pass density=True, stacked=True to plt.hist(). An equivalent solution would be to do this and then multiply the height of each bar by 1000, if that is easier than changing what the histogram is normalized to.

Thank you very much in advance!


Solution

  • The following approach uses np.histogram to calculate the counts for each histogram bin. With 1000 / total_count / bin_width as the normalization factor, the total area of the bars is 1000. To make the bar heights sum to 1000 instead, use a factor of 1000 / total_count. plt.bar is used to display the result.

    The example code also draws the same stacked histogram with density=True, to compare it with the new histogram whose total area is 1000.

    import matplotlib.pyplot as plt
    import numpy as np
    
    data = [np.random.randn(100) * 5 + 10, np.random.randn(300) * 4 + 14, np.random.randn(100) * 3 + 17]
    fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(12, 4))
    
    ax1.hist(data, stacked=True, density=True)
    ax1.set_title('Histogram with density=True')
    
    # Use the same equal-width bins for all datasets, spanning the full data range
    xmin = min([min(d) for d in data])
    xmax = max([max(d) for d in data])
    bins = np.linspace(xmin, xmax, 11)
    bin_width = bins[1] - bins[0]
    
    # Raw counts per bin for each dataset, and the total number of samples
    counts = [np.histogram(d, bins=bins)[0] for d in data]
    total_count = sum([sum(c) for c in counts])
    # factor = 1000 / total_count  # alternative: bar heights sum to 1000
    factor = 1000 / total_count / bin_width  # total bar area equals 1000
    thousands = [c * factor for c in counts]
    
    bottom = 0
    for t in thousands:
        ax2.bar(bins[:-1], t, bottom=bottom, width=bin_width, align='edge')
        bottom += t
    ax2.set_title('Histogram with total area of 1000')
    
    plt.show()
    

    [Figure: stacked histogram with density=True (left) and stacked histogram with a total area of 1000 (right)]
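
    As a quick sanity check on the normalization factor above, here is a minimal NumPy-only sketch. It gives every sample a weight of target_area / total_count / bin_width, then confirms that the resulting bars cover an area of 1000 and match the density=True heights multiplied by 1000. The names rng, target_area, all_values and weight, as well as the fixed seed, are illustrative choices and not part of the original answer.

```python
import numpy as np

# Fixed seed chosen only so the sketch is reproducible (an assumption, not in the original)
rng = np.random.default_rng(0)
data = [rng.normal(10, 5, 100), rng.normal(14, 4, 300), rng.normal(17, 3, 100)]

target_area = 1000
all_values = np.concatenate(data)

# Ten equal-width bins spanning the full range of the combined data
bins = np.linspace(all_values.min(), all_values.max(), 11)
bin_width = bins[1] - bins[0]

# Each sample adds this much to the height of the bar it falls into
weight = target_area / (all_values.size * bin_width)
heights = [np.histogram(d, bins=bins, weights=np.full(d.size, weight))[0] for d in data]

# Total area of the stacked bars: sum of all heights times the common bin width
total_area = sum(h.sum() for h in heights) * bin_width

# Same result via the route suggested in the question: density heights times 1000
density_heights = np.histogram(all_values, bins=bins, density=True)[0] * target_area

print(total_area)
print(np.allclose(sum(heights), density_heights))
```

    The same per-sample weights could presumably be passed to ax.hist through its weights argument together with stacked=True, which would avoid the manual ax.bar loop; that variant is not shown or tested here.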