Search code examples
pythonpython-3.xgaussianlmfit

Finding uncertainty, reduced chi-square for a gaussian fit in Python


I tried computing the standard errors for my data points for a Gaussian fit. I want to know how to calculate the errors and obtain the uncertainty. I want to compute the value of the reduced (chi-squared). Any suggestions would help. any suggestions and help in correcting the residual(p,x,y) would really help.

import csv
import pandas as pd
import numpy as np
from scipy import optimize
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit
from scipy import asarray as ar,exp
#reading the x,y,z values from the respective csv files
xData = []
yData = []
path = r'C:\Users\angel\OneDrive\Documents\CSV_FILES_NV_LAB\1111 x 30.csv'
with open(path, "r") as f_in:
    reader = csv.reader(f_in)
    next(reader)  # skip headers

    for line in reader:
        try:
            float_1, float_2=float(line[0]),float(line[1])
            xData.append(float_1)
            yData.append(float_2)
        except ValueError:
            continue
#printing the columns of the csv files and storing as an array
print(xData)
print(yData)
#plot the data points
plt.plot(xData,yData,'bo',label='experimental_data')
plt.show()
#define the function we want to fit the plot into
# Define the Gaussian function
n = len(xData)
xData=np.array(xData)
yData=np.array(yData)
mean = sum(xData*yData)/sum(yData)
sigma = np.sqrt(sum(yData*(xData-mean)**2)/sum(yData))
def Gauss(x,I0,x0,sigma,Background):
    return I0*exp(-(x-x0)**2/(2*sigma**2))+Background

#popt,pcov = curve_fit(Gauss,xData,yData,p0=[1,mean,sigma, 0.0])
#calculating error methods

################################################################################
def residual(p,x,y):
    return Gaussian(x,*p)-y
initGuess=[1,1,1]
popt,pcov,infodict,mesg,ier=optimize.least_squares(residual,initGuess,args=[x,y],full_output=True)
s_sq = (infodict['fvec']**2).sum()/ (N-n)
#####################################################################################
print(popt)
plt.plot(xData,yData,'b+:',label='data')
plt.plot(xData,Gauss(xData,*popt),'ro:',label='fit')
plt.legend()
plt.title('Gaussian_Fit')
plt.xlabel('x-axis')
plt.ylabel('PL Intensity')
plt.show()

Sample input data:

xData=[
    -7.66e-06, -7.6e-06, -7.53e-06, -7.46e-06, -7.4e-06,
    -7.33e-06, -7.26e-06, -7.19e-06
]
yData=[
    17763.0, 2853.0, 3694.0, 4203.0, 4614.0, 4984.0,
    5080.0, 7038.0, 6905.0
]

Output Error:

popt,pcov,infodict,mesg,ier=optimize.least_squares(
    residual,initGuess,args=[x,y],full_output=True
)
NameError: name 'x' is not defined.

Solution

  • Readable code always helps. Like, the error you are getting is

    popt,pcov,infodict,mesg,ier=optimize.least_squares(residual,initGuess,args=[x,y],full_output=True) 
    NameError: name 'x' is not defined.
    

    pretty much is telling you that 'x' is not defined. Perhaps you meant 'xData'?

    I recommend starting with the easier-to-use lmfit. With that, a fit to Gaussian data in a CSV file might look like this:

    from pandas import read_csv
    from lmfit.models import GaussianModel
    from matplotlib import pyplot as plt
    
    dframe = read_csv('peak.csv')
    xdata = dframe.to_numpy()[:, 0]
    ydata = dframe.to_numpy()[:, 1]
    
    model = GaussianModel()
    params = model.guess(ydata, x=xdata)
    result = model.fit(ydata, params, x=xdata)
    
    print(f'Chi-square = {result.chisqr:.4f}, Reduced Chi-square = {result.redchi:.4f}')
    
    print(result.fit_report())
    
    plt.plot(xdata, ydata, label='data')
    plt.plot(xdata, result.best_fit, label='best fit')
    plt.legend()
    plt.show()
    

    That print(result.fit_report()) will print out a something like this:

    [[Model]]
        Model(gaussian)
    [[Fit Statistics]]
        # fitting method   = leastsq
        # function evals   = 21
        # data points      = 101
        # variables        = 3
        chi-square         = 7.60712520
        reduced chi-square = 0.07762373
        Akaike info crit   = -255.189553
        Bayesian info crit = -247.344192
    [[Variables]]
        amplitude:  30.5250840 +/- 0.31978873 (1.05%) (init = 40.626)
        center:     9.22348190 +/- 0.01498559 (0.16%) (init = 9.3)
        sigma:      1.23877032 +/- 0.01498552 (1.21%) (init = 1.3)
        fwhm:       2.91708114 +/- 0.03528820 (1.21%) == '2.3548200*sigma'
        height:     9.83051253 +/- 0.10298786 (1.05%) == '0.3989423*amplitude/max(1e-15, sigma)'
    [[Correlations]] (unreported correlations are < 0.100)
        C(amplitude, sigma) =  0.577