I have various spectrograms that I am trying to fit a function to. LMFIT has a composite model feature that I am using, my model is essentially a sum of Voigt or gaussian peaks against a constant background. The initial guesses for the peak centres are found using the scipy peak finder function.
The fit is actually mostly pretty good, even for the data sets with many peaks. My problem is that sometimes (for some larger data sets with more peaks) the errors on the fit report aren't shown, but sometimes they are.
From reading other questions about similar topics I seem to have gathered that this could be due to a parameter not being used, or being pushed to a boundary I have set. In light of this I tried removing all boundaries, and just setting the initial conditions for each peak. It still did not show the errors.
I realise it may be asking a lot to fit 15 or so overlapping gaussian curves to a spectrogram, but since the fit seems to work pretty well by eye, I assume it must work to some degree.
Essentially I'd like to know how to get the errors in the fit report for the datasets with many peaks, just like the data sets with 2 peaks.
Here are two example fit reports:
Two peaks, errors are shown:
[[Model]]
((Model(constant) + Model(voigt, prefix='g0_')) + Model(voigt, prefix='g1_'))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 186
# data points = 564
# variables = 9
chi-square = 1342.58147
reduced chi-square = 2.41906571
Akaike info crit = 507.154526
Bayesian info crit = 546.170014
[[Variables]]
g0_sigma: 7.58224056 +/- 0.07787554 (1.03%) (init = 7)
g0_center: 657.390036 +/- 0.02496585 (0.00%) (init = 656.4029)
g0_amplitude: 1549654.58 +/- 5871.81065 (0.38%) (init = 57643.33)
g0_gamma: 3.70825451 +/- 0.10708503 (2.89%) (init = 0.7)
g0_fwhm: 27.3059988 +/- 0.28045426 (1.03%) == '3.6013100*g0_sigma'
g0_height: 57414.5545 +/- 137.261488 (0.24%) == 'g0_amplitude*wofz((1j*g0_gamma)/(g0_sigma*sqrt(2))).real/(g0_sigma*sqrt(2*pi))'
g1_sigma: 7.66546221 +/- 0.55266628 (7.21%) (init = 7)
g1_center: 803.744461 +/- 0.16381855 (0.02%) (init = 803.2903)
g1_amplitude: 315684.011 +/- 6421.21329 (2.03%) (init = 10676.22)
g1_gamma: 6.21905616 +/- 0.67178258 (10.80%) (init = 0.7)
g1_fwhm: 27.6057057 +/- 1.99032260 (7.21%) == '3.6013100*g1_sigma'
g1_height: 9525.49591 +/- 130.820663 (1.37%) == 'g1_amplitude*wofz((1j*g1_gamma)/(g1_sigma*sqrt(2))).real/(g1_sigma*sqrt(2*pi))'
c: 1096.67100 +/- 6.42230443 (0.59%) (init = 0)
[[Correlations]] (unreported correlations are < 0.250)
C(g0_sigma, g0_gamma) = -0.929
C(g1_sigma, g1_gamma) = -0.926
C(g0_amplitude, g0_gamma) = 0.828
C(g1_amplitude, g1_gamma) = 0.821
C(g0_sigma, g0_amplitude) = -0.659
C(g1_sigma, g1_amplitude) = -0.650
then without errors (Ignore the reduced chi-squared etc on this):
[[Model]]
((((((((((((((((((Model(constant) + Model(voigt, prefix='g0_')) + Model(voigt, prefix='g1_')) + Model(voigt, prefix='g2_')) + Model(voigt, prefix='g3_')) + Model(voigt, prefix='g4_')) + Model(voigt, prefix='g5_')) + Model(voigt, prefix='g6_')) + Model(voigt, prefix='g7_')) + Model(voigt, prefix='g8_')) + Model(voigt, prefix='g9_')) + Model(voigt, prefix='g10_')) + Model(voigt, prefix='g11_')) + Model(voigt, prefix='g12_')) + Model(voigt, prefix='g13_')) + Model(voigt, prefix='g14_')) + Model(voigt, prefix='g15_')) + Model(voigt, prefix='g16_')) + Model(voigt, prefix='g17_'))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 101758
# data points = 564
# variables = 73
chi-square = 13631.1513
reduced chi-square = 27.7620190
Akaike info crit = 1942.37313
Bayesian info crit = 2258.83209
[[Variables]]
g0_sigma: 192.689563 (init = 7)
g0_center: 422.997773 (init = 389.2612)
g0_amplitude: 1068.20554 (init = 2820.275)
g0_gamma: -618.292505 (init = 0.7)
g0_fwhm: 693.934849 == '3.6013100*g0_sigma'
g0_height: 760.695408 == 'g0_amplitude*wofz((1j*g0_gamma)/(g0_sigma*sqrt(2))).real/(g0_sigma*sqrt(2*pi))'
g1_sigma: 17.9116349 (init = 7)
g1_center: 431.473501 (init = 431.5196)
g1_amplitude: 4525.79900 (init = 3929.55)
g1_gamma: -36.4029462 (init = 0.7)
g1_fwhm: 64.5053499 == '3.6013100*g1_sigma'
g1_height: 1556.62320 == 'g1_amplitude*wofz((1j*g1_gamma)/(g1_sigma*sqrt(2))).real/(g1_sigma*sqrt(2*pi))'
g2_sigma: 4.86138247 (init = 7)
g2_center: 805.214348 (init = 803.2903)
g2_amplitude: 668696.572 (init = 33620)
g2_gamma: 3.52118850 (init = 0.7)
g2_fwhm: 17.5073453 == '3.6013100*g2_sigma'
g2_height: 33446.8793 == 'g2_amplitude*wofz((1j*g2_gamma)/(g2_sigma*sqrt(2))).real/(g2_sigma*sqrt(2*pi))'
g3_sigma: 5.13832814 (init = 7)
g3_center: 1032.12566 (init = 1035.003)
g3_amplitude: 595227.235 (init = 17401.5)
g3_gamma: 8.79672669 (init = 0.7)
g3_fwhm: 18.5047125 == '3.6013100*g3_sigma'
g3_height: 17386.9619 == 'g3_amplitude*wofz((1j*g3_gamma)/(g3_sigma*sqrt(2))).real/(g3_sigma*sqrt(2*pi))'
g4_sigma: 5.06870799 (init = 7)
g4_center: 1160.54035 (init = 1160.308)
g4_amplitude: 74021.3519 (init = 4387.175)
g4_gamma: 5.14599307 (init = 0.7)
g4_fwhm: 18.2539888 == '3.6013100*g4_sigma'
g4_height: 3023.66817 == 'g4_amplitude*wofz((1j*g4_gamma)/(g4_sigma*sqrt(2))).real/(g4_sigma*sqrt(2*pi))'
g5_sigma: 5.71945320 (init = 7)
g5_center: 1270.96477 (init = 1268.509)
g5_amplitude: 458428.440 (init = 15159)
g5_gamma: 7.80744109 (init = 0.7)
g5_fwhm: 20.5975240 == '3.6013100*g5_sigma'
g5_height: 13982.1731 == 'g5_amplitude*wofz((1j*g5_gamma)/(g5_sigma*sqrt(2))).real/(g5_sigma*sqrt(2*pi))'
g6_sigma: 2.6981e-09 (init = 7)
g6_center: 1448.28921 (init = 1352.596)
g6_amplitude: 572109.865 (init = 4058.475)
g6_gamma: 13.5568440 (init = 0.7)
g6_fwhm: 9.7166e-09 == '3.6013100*g6_sigma'
g6_height: 13432.9366 == 'g6_amplitude*wofz((1j*g6_gamma)/(g6_sigma*sqrt(2))).real/(g6_sigma*sqrt(2*pi))'
g7_sigma: 12.5995161 (init = 7)
g7_center: 1351.19388 (init = 1450.939)
g7_amplitude: 64633.5688 (init = 13943)
g7_gamma: -1.76986853 (init = 0.7)
g7_fwhm: 45.3747632 == '3.6013100*g7_sigma'
g7_height: 2297.69041 == 'g7_amplitude*wofz((1j*g7_gamma)/(g7_sigma*sqrt(2))).real/(g7_sigma*sqrt(2*pi))'
g8_sigma: 697.890539 (init = 7)
g8_center: 2422.51062 (init = 1855.019)
g8_amplitude: 210.220409 (init = 2548.15)
g8_gamma: -3083.15825 (init = 0.7)
g8_fwhm: 2513.32018 == '3.6013100*g8_sigma'
g8_height: 4158.40426 == 'g8_amplitude*wofz((1j*g8_gamma)/(g8_sigma*sqrt(2))).real/(g8_sigma*sqrt(2*pi))'
g9_sigma: 83.3565885 (init = 7)
g9_center: 2867.11059 (init = 1926.474)
g9_amplitude: 6021213.64 (init = 2513.125)
g9_gamma: -231.269811 (init = 0.7)
g9_fwhm: 300.192916 == '3.6013100*g9_sigma'
g9_height: 2697769.12 == 'g9_amplitude*wofz((1j*g9_gamma)/(g9_sigma*sqrt(2))).real/(g9_sigma*sqrt(2*pi))'
g10_sigma: 105.224438 (init = 7)
g10_center: 2757.10276 (init = 1940.692)
g10_amplitude: -13348456.2 (init = 2628.65)
g10_gamma: -300.405844 (init = 0.7)
g10_fwhm: 378.945820 == '3.6013100*g10_sigma'
g10_height: -5945307.23 == 'g10_amplitude*wofz((1j*g10_gamma)/(g10_sigma*sqrt(2))).real/(g10_sigma*sqrt(2*pi))'
g11_sigma: 100.413181 (init = 7)
g11_center: 2741.65736 (init = 2226.984)
g11_amplitude: 10844230.4 (init = 2359.375)
g11_gamma: -296.216143 (init = 0.7)
g11_fwhm: 361.618992 == '3.6013100*g11_sigma'
g11_height: 6673387.09 == 'g11_amplitude*wofz((1j*g11_gamma)/(g11_sigma*sqrt(2))).real/(g11_sigma*sqrt(2*pi))'
g12_sigma: 366.846051 (init = 7)
g12_center: 1940.87402 (init = 2450.446)
g12_amplitude: 256322.806 (init = 2311.475)
g12_gamma: -753.396963 (init = 0.7)
g12_fwhm: 1321.12635 == '3.6013100*g12_sigma'
g12_height: 4501.31993 == 'g12_amplitude*wofz((1j*g12_gamma)/(g12_sigma*sqrt(2))).real/(g12_sigma*sqrt(2*pi))'
g13_sigma: 103.888422 (init = 7)
g13_center: 3019.80754 (init = 2667.96)
g13_amplitude: 1616371.56 (init = 4225.95)
g13_gamma: -375.306839 (init = 0.7)
g13_fwhm: 374.134413 == '3.6013100*g13_sigma'
g13_height: 8468440.28 == 'g13_amplitude*wofz((1j*g13_gamma)/(g13_sigma*sqrt(2))).real/(g13_sigma*sqrt(2*pi))'
g14_sigma: 147.814985 (init = 7)
g14_center: 2947.03711 (init = 2700.413)
g14_amplitude: 167.834745 (init = 3302.95)
g14_gamma: -836.092487 (init = 0.7)
g14_fwhm: 532.327584 == '3.6013100*g14_sigma'
g14_height: 8027180.83 == 'g14_amplitude*wofz((1j*g14_gamma)/(g14_sigma*sqrt(2))).real/(g14_sigma*sqrt(2*pi))'
g15_sigma: 145.620972 (init = 7)
g15_center: 2989.06096 (init = 2803.39)
g15_amplitude: -886458.453 (init = 2653.55)
g15_gamma: -516.771454 (init = 0.7)
g15_fwhm: 524.426264 == '3.6013100*g15_sigma'
g15_height: -2636036.61 == 'g15_amplitude*wofz((1j*g15_gamma)/(g15_sigma*sqrt(2))).real/(g15_sigma*sqrt(2*pi))'
g16_sigma: 78.0169344 (init = 7)
g16_center: 3135.45756 (init = 2860.738)
g16_amplitude: -1388405.82 (init = 38420.5)
g16_gamma: -217.130421 (init = 0.7)
g16_fwhm: 280.963166 == '3.6013100*g16_sigma'
g16_height: -680872.187 == 'g16_amplitude*wofz((1j*g16_gamma)/(g16_sigma*sqrt(2))).real/(g16_sigma*sqrt(2*pi))'
g17_sigma: 81.7099268 (init = 7)
g17_center: 3322.62834 (init = 2936.564)
g17_amplitude: 333478.937 (init = 43586)
g17_gamma: -216.876994 (init = 0.7)
g17_fwhm: 294.262776 == '3.6013100*g17_sigma'
g17_height: 109848.342 == 'g17_amplitude*wofz((1j*g17_gamma)/(g17_sigma*sqrt(2))).real/(g17_sigma*sqrt(2*pi))'
c: 1502.83014 (init = 0)
Also here is a cut down version of the code (PseudoPseudoCode?) that doesn't run but shows what I'm trying to do. The actual code I have runs so it's not really a problem with the code, so to speak, but perhaps it'll illustrate what I'm trying to do. I'm sure it's horribly optimised so feel free to mock me in your answer.
#read in data
#decide on stn ratio prominence and other values, the model type to use etc
#find estimate for peak centre, using this scipy function
peaks = scipy.signal.find_peaks(y,height = error*StN, width = width,prominence=PStN)
.
.
.
.
.
#obviously cut out some of the other code but this loops to make a gaussian or a voigt or whatever for each peak found above
model_array = {}
pars = lm.Parameters()
while n < peakamt:
model_array[n] = modeltype(prefix = "g"+str(n)+"_")
pars.update(model_array[n].make_params())
pars['g'+str(n)+'_center'].set(value=xpeaks[n])
pars['g'+str(n)+'_sigma'].set(value=7)
pars['g'+str(n)+'_amplitude'].set(value=ypeaks[n])
if modeltype == VoigtModel:
pars["g"+str(n)+'_gamma'].set(value=0.7, vary=True, expr = '')
#print(n)
n = n + 1
#adds the constant model
const = ConstantModel()
pars.update(const.make_params())
#makes the entire composite model, adding each peak model to the constant model
model = const
m = 0
while m < n:
model = model + model_array[m]
time.sleep(0.1)
print(".")
m = m+1
#here is the fitting
out = model.fit(y, pars, x=x, weights = weight)
final_fit = np.array(out.best_fit)
residuals = final_fit - y
#creating main figure
fig = plt.subplot(2,1,1)
#fig.plot(w,Lorentz)
fig.plot(x,final_fit)
fig.plot(x, y, 'ro', markersize = 2)
fig.errorbar(x,y,yerr=e, linestyle='none')
fig.set_title(str(ElementName) + " Deg " + str(Degrees))
fig.set_ylabel('Intensity (counts)')
fig.set_xlabel('Wavenumber (cm^-1)')
#creating residual figure
res = plt.subplot(2,1,2)
res.plot(x,residuals, 'r+')
res.errorbar(x,residuals,yerr=e,linestyle='none')
res.set_title("Residuals")
#Formatting the figures
plt.tight_layout()
#Output
print(out.fit_report(min_correl=0.25))
print('\n')
plt.show()
Thank you very much for reading this, I hope someone can help me. I apologise if this was torture to read, I don't have much experience with this website or coding in general. I hope it wasn't too bad.
The latest version of lmfit
should include in the report some information about why it could not estimate the uncertainties. Generally, this happens if some parameter goes so far off that its contribution has no effect on the model or fit, or if some parameter gets stuck at a boundary. The report should give some hint of this.
In your case, it looks like the gamma
parameters have all gone very negative and completely crazy. For the Voigt function, gamma < 0
is not exactly "absurd", but it does make the function not-exactly a "peak", more like a mixture of "sharp Lorentzian minus broader Gaussian". You might want to put a lower bound on gamma
, perhaps 0.
There are other possibilities like constraining gamma
to be some factor times sigma
and using the same factor for all peak, say like
pars = lm.Parameters()
pars.add('gamma_scale', value=0.7, min=0, max=100, vary=True)
while n < peakamt:
pref = 'g%d_' % n
model_array[n] = modeltype(prefix =pref)
pars.update(model_array[n].make_params())
pars['%s_center' % pref].set(value=xpeaks[n])
pars['%s_sigma' % pref].set(value=7)
pars['%s_amplitude' % pref].set(value=ypeaks[n])
if modeltype == VoigtModel:
pars['%s_gamma' % pref].set(expr='gamma_scale*%s_sigma' % pref)
n = n + 1
That will still allow the Voigt peaks to have some variability with gamma
, but constrain them to all have the same amount of gamma
-ness. Whether that is what you really want to do probably depends on the nature of your data.