python curve-fitting gaussian scipy-optimize

How to get the percentage of each Gaussian in a double Gaussian fit?


I am fitting a double Gaussian to my data using scipy.optimize.curve_fit.

import numpy as np
import scipy
from scipy.optimize import curve_fit

def gaussian(x, mu, sigma, A):
    return A*np.exp(-(x-mu)**2/2/sigma**2)

def bimodal(x, mu1, sigma1, A1, mu2, sigma2, A2):
    return gaussian(x, mu1, sigma1, A1)+gaussian(x, mu2, sigma2, A2)

def fit_gaussian(n_bins, data):
    bin_heights, bin_borders = np.histogram(np.array(data), bins=n_bins, density=True)
    bin_centers = bin_borders[:-1] + np.diff(bin_borders) / 2
    popt, pcov = scipy.optimize.curve_fit(bimodal, xdata=bin_centers, ydata=bin_heights)
    return bin_borders, popt

bins_fit, popt = fit_gaussian(n_bins=100, data=my_data)

What is the percentage of Gaussian 1 and the percentage of Gaussian 2 in this total population? Is it just A1/(A1+A2)*100 and A2/(A1+A2)*100 or do I need to correct for something?


Solution

  • The amplitudes alone are not enough; you also need to account for the sigma of each Gaussian. When you say percentage coming from each population, I think you are asking about the total counts (the integral) under each Gaussian. For a Gaussian in your form, the integral is:

    integral = A * np.sqrt(2*np.pi*sigma**2)
    

    The percentage of the population in Gaussian 1 is then (swap in integral2 in the numerator for Gaussian 2)

    integral1 / (integral1 + integral2) * 100
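
To make the recipe above concrete, here is a minimal sketch that turns the fitted parameters into percentages. The helper names `gaussian_area` and `component_percentages` and the values in `example_popt` are made up for illustration; the only assumption carried over from the question is that `popt` is ordered as in `bimodal()`, that is `(mu1, sigma1, A1, mu2, sigma2, A2)`.

```python
import numpy as np

def gaussian_area(sigma, A):
    """Area under A*exp(-(x-mu)**2 / (2*sigma**2)) over the whole real line."""
    # abs() guards against a fit that converged to a negative sigma,
    # which describes the same curve because sigma only appears squared.
    return A * np.sqrt(2 * np.pi) * abs(sigma)

def component_percentages(popt):
    """Percentage of the total area contributed by each Gaussian.

    popt is assumed to be ordered as in bimodal():
    (mu1, sigma1, A1, mu2, sigma2, A2).
    """
    mu1, sigma1, A1, mu2, sigma2, A2 = popt
    area1 = gaussian_area(sigma1, A1)
    area2 = gaussian_area(sigma2, A2)
    total = area1 + area2
    return 100 * area1 / total, 100 * area2 / total

# Hypothetical fit result, chosen only to illustrate the difference
example_popt = [0.0, 1.0, 0.3, 5.0, 2.0, 0.1]
pct1, pct2 = component_percentages(example_popt)
print(f"{pct1:.1f}% / {pct2:.1f}%")  # 60.0% / 40.0%
```

With these made-up numbers, A1/(A1+A2)*100 would give 75%, while the area-based share is 60%, because the second Gaussian is twice as wide. Since the histogram is built with density=True, the two areas should also sum to roughly 1 when the fit is good, which is a quick sanity check.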