I tried computing the standard errors for my data points for a Gaussian fit. I want to know how to calculate the errors and obtain the uncertainty. I want to compute the value of the reduced (chi-squared). Any suggestions would help. any suggestions and help in correcting the residual(p,x,y) would really help.
import csv
import pandas as pd
import numpy as np
from scipy import optimize
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit
from scipy import asarray as ar,exp
#reading the x,y,z values from the respective csv files
xData = []
yData = []
path = r'C:\Users\angel\OneDrive\Documents\CSV_FILES_NV_LAB\1111 x 30.csv'
with open(path, "r") as f_in:
reader = csv.reader(f_in)
next(reader) # skip headers
for line in reader:
try:
float_1, float_2=float(line[0]),float(line[1])
xData.append(float_1)
yData.append(float_2)
except ValueError:
continue
#printing the columns of the csv files and storing as an array
print(xData)
print(yData)
#plot the data points
plt.plot(xData,yData,'bo',label='experimental_data')
plt.show()
#define the function we want to fit the plot into
# Define the Gaussian function
n = len(xData)
xData=np.array(xData)
yData=np.array(yData)
mean = sum(xData*yData)/sum(yData)
sigma = np.sqrt(sum(yData*(xData-mean)**2)/sum(yData))
def Gauss(x,I0,x0,sigma,Background):
return I0*exp(-(x-x0)**2/(2*sigma**2))+Background
#popt,pcov = curve_fit(Gauss,xData,yData,p0=[1,mean,sigma, 0.0])
#calculating error methods
################################################################################
def residual(p,x,y):
return Gaussian(x,*p)-y
initGuess=[1,1,1]
popt,pcov,infodict,mesg,ier=optimize.least_squares(residual,initGuess,args=[x,y],full_output=True)
s_sq = (infodict['fvec']**2).sum()/ (N-n)
#####################################################################################
print(popt)
plt.plot(xData,yData,'b+:',label='data')
plt.plot(xData,Gauss(xData,*popt),'ro:',label='fit')
plt.legend()
plt.title('Gaussian_Fit')
plt.xlabel('x-axis')
plt.ylabel('PL Intensity')
plt.show()
Sample input data:
xData=[
-7.66e-06, -7.6e-06, -7.53e-06, -7.46e-06, -7.4e-06,
-7.33e-06, -7.26e-06, -7.19e-06
]
yData=[
17763.0, 2853.0, 3694.0, 4203.0, 4614.0, 4984.0,
5080.0, 7038.0, 6905.0
]
Output Error:
popt,pcov,infodict,mesg,ier=optimize.least_squares(
residual,initGuess,args=[x,y],full_output=True
)
NameError: name 'x' is not defined.
Readable code always helps. Like, the error you are getting is
popt,pcov,infodict,mesg,ier=optimize.least_squares(residual,initGuess,args=[x,y],full_output=True)
NameError: name 'x' is not defined.
pretty much is telling you that 'x' is not defined. Perhaps you meant 'xData'?
I recommend starting with the easier-to-use lmfit
. With that, a fit to Gaussian data in a CSV file might look like this:
from pandas import read_csv
from lmfit.models import GaussianModel
from matplotlib import pyplot as plt
dframe = read_csv('peak.csv')
xdata = dframe.to_numpy()[:, 0]
ydata = dframe.to_numpy()[:, 1]
model = GaussianModel()
params = model.guess(ydata, x=xdata)
result = model.fit(ydata, params, x=xdata)
print(f'Chi-square = {result.chisqr:.4f}, Reduced Chi-square = {result.redchi:.4f}')
print(result.fit_report())
plt.plot(xdata, ydata, label='data')
plt.plot(xdata, result.best_fit, label='best fit')
plt.legend()
plt.show()
That print(result.fit_report())
will print out a something like this:
[[Model]]
Model(gaussian)
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 21
# data points = 101
# variables = 3
chi-square = 7.60712520
reduced chi-square = 0.07762373
Akaike info crit = -255.189553
Bayesian info crit = -247.344192
[[Variables]]
amplitude: 30.5250840 +/- 0.31978873 (1.05%) (init = 40.626)
center: 9.22348190 +/- 0.01498559 (0.16%) (init = 9.3)
sigma: 1.23877032 +/- 0.01498552 (1.21%) (init = 1.3)
fwhm: 2.91708114 +/- 0.03528820 (1.21%) == '2.3548200*sigma'
height: 9.83051253 +/- 0.10298786 (1.05%) == '0.3989423*amplitude/max(1e-15, sigma)'
[[Correlations]] (unreported correlations are < 0.100)
C(amplitude, sigma) = 0.577