I am fitting a double Gaussian to my data using scipy.optimize.curve_fit.
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, mu, sigma, A):
    return A*np.exp(-(x-mu)**2/2/sigma**2)

def bimodal(x, mu1, sigma1, A1, mu2, sigma2, A2):
    return gaussian(x, mu1, sigma1, A1)+gaussian(x, mu2, sigma2, A2)

def fit_gaussian(n_bins, data):
    # density=True normalizes the histogram so that its total area is 1
    bin_heights, bin_borders = np.histogram(np.array(data), bins=n_bins, density=True)
    bin_centers = bin_borders[:-1] + np.diff(bin_borders) / 2
    popt, pcov = curve_fit(bimodal, xdata=bin_centers, ydata=bin_heights)
    return bin_borders, popt

bins_fit, popt = fit_gaussian(n_bins=100, data=my_data)
What is the percentage of Gaussian 1 and the percentage of Gaussian 2 in this total population? Is it just A1/(A1+A2)*100 and A2/(A1+A2)*100 or do I need to correct for something?
You need to include the sigma of each Gaussian. The amplitude A is only the peak height, so a wide Gaussian contains more of the population than a narrow one with the same A. When you say the percentage coming from each population, I think you are asking about the total counts (the integral) in each Gaussian. For a Gaussian in your form, the integral is:
integral = A * np.sqrt(2*np.pi*sigma**2)
The percentage of the population in Gaussian 1 is then (and likewise for Gaussian 2, with integral2 in the numerator):
integral1 / (integral1 + integral2) * 100
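As a minimal sketch of how this could be applied to the fitted parameters, assuming popt is ordered as (mu1, sigma1, A1, mu2, sigma2, A2) to match the bimodal signature in the question. The helper names gaussian_area and population_percentages and the example parameter values are hypothetical, chosen only for illustration. It uses the standard-library math module so that it runs on its own; abs(sigma) is equivalent to sqrt(sigma**2) and covers the case where the fit returns a negative sigma.

```python
import math

def gaussian_area(A, sigma):
    # Area under A*exp(-(x-mu)**2 / (2*sigma**2)) over all x.
    # abs(sigma) equals sqrt(sigma**2), so a negative fitted sigma is handled.
    return A * math.sqrt(2 * math.pi) * abs(sigma)

def population_percentages(popt):
    # Assumed parameter order, matching bimodal() in the question.
    mu1, sigma1, A1, mu2, sigma2, A2 = popt
    area1 = gaussian_area(A1, sigma1)
    area2 = gaussian_area(A2, sigma2)
    total = area1 + area2
    return 100 * area1 / total, 100 * area2 / total

# Made-up parameters for illustration only (not taken from any real fit).
example_popt = [0.0, 1.0, 0.3, 5.0, 2.0, 0.05]
pct1, pct2 = population_percentages(example_popt)
print(pct1, pct2)  # approximately 75 and 25
```

With these made-up values, the amplitude-only formula A1/(A1+A2)*100 would give about 85.7% for Gaussian 1, whereas the area-based result is about 75%, because Gaussian 2 is twice as wide. Since the histogram was built with density=True, the two areas should also sum to roughly 1 if the fit describes the data well.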