python numpy matplotlib scipy scipy-optimize

How to match a Gaussian normal to a histogram?


I'm wondering if there is a good way to fit a Gaussian (normal) curve to a histogram that is given as a numpy array from np.histogram(array, bins).

How can such a curve be plotted on the same graph and scaled in height and width to match the histogram?


Solution

  • You can fit a Gaussian (i.e. normal) distribution to your histogram, for example with scipy's curve_fit. I have written a small example below. Note that, depending on your data, you may need to find a way to make good guesses for the starting values of the fit (p0). Poor starting values may cause the fit to fail.

    import numpy as np
    from scipy.optimize import curve_fit
    import matplotlib.pyplot as plt
    from scipy.stats import norm
    
    def fit_func(x,a,mu,sigma,c):
        """gaussian function used for the fit"""
        return a * norm.pdf(x,loc=mu,scale=sigma) + c
    
    #make up some normally distributed data and do a histogram
    y = 2 * np.random.normal(loc=1,scale=2,size=1000) + 2
    no_bins = 20
    hist,left = np.histogram(y,bins=no_bins)
    centers = left[:-1] + (left[1] - left[0]) / 2  # bin centers: left edge plus half a bin width
    
    #fit the histogram
    p0 = [2,0,2,2] #starting values for the fit
    p1,_ = curve_fit(fit_func,centers,hist,p0,maxfev=10000)
    
    #plot the histogram and fit together
    fig,ax = plt.subplots()
    ax.hist(y,bins=no_bins)
    x = np.linspace(left[0],left[-1],1000)
    y_fit = fit_func(x, *p1)
    ax.plot(x,y_fit,'r-')
    plt.show()
    

    Histogram with Gaussian fit
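
On the point about starting values: one way to get reasonable guesses is to compute them from the data itself. The snippet below is only a sketch under a few assumptions: the helper name initial_guess is made up for illustration, the bins are assumed to have equal width, and the offset c is assumed to start near zero. It uses the fact that norm.pdf integrates to 1, so the amplitude a should be close to the histogram's area, i.e. the number of samples times the bin width.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm


def fit_func(x, a, mu, sigma, c):
    """Scaled Gaussian plus a constant offset (same form as in the answer)."""
    return a * norm.pdf(x, loc=mu, scale=sigma) + c


def initial_guess(samples, edges):
    """Hypothetical helper: build p0 = [a, mu, sigma, c] from the raw samples.

    Assumes equal-width bins and a background offset close to zero.
    """
    bin_width = edges[1] - edges[0]
    a0 = len(samples) * bin_width  # approximate area under the histogram
    mu0 = np.mean(samples)         # sample mean as the guess for the center
    sigma0 = np.std(samples)       # sample standard deviation as the width
    c0 = 0.0                       # assumed: no constant background
    return [a0, mu0, sigma0, c0]


# Same kind of made-up data as above, with a fixed seed for reproducibility
rng = np.random.default_rng(0)
samples = 2 * rng.normal(loc=1, scale=2, size=1000) + 2

hist, edges = np.histogram(samples, bins=20)
centers = (edges[:-1] + edges[1:]) / 2  # midpoint of each bin

p0 = initial_guess(samples, edges)
params, _ = curve_fit(fit_func, centers, hist, p0=p0)
print("starting values:", p0)
print("fitted values:  ", params)
```

With guesses this close to the final values, the fit should usually converge without raising maxfev. If your data sits on a noticeable background, something like the smallest bin count may be a better starting value for c than zero.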