I am trying to implement lmfit
into my fitting routines, and I am having issues defining the errors. I premise that I read previous questions regarding the topic on this platform, and I also went through the docs, but some of my doubts are still there.
Below is a complete and minimal example of what I am trying to achieve.
import corner
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import scipy as sp
import lmfit
font = {'fontname':'candara', "fontweight":"light"}
plt.style.use('ggplot')
plt.rcParams["figure.figsize"] = (8,4)
ax_fit_kws = dict(xlim=(0,0.12), ylim=(0.5,1.1))
ax_res_kws = dict(xlim=(0,0.12), ylim=(-0.1,0.1))
def mono_exp(SL_array, a, b):
model = a * np.exp(-b * SL_array)
return model
model = lmfit.Model(mono_exp)
SL_array = np.array((0.030, 0.040, 0.060, 0.080, 0.10))
data= np.array((1., 0.9524336, 0.92452666, 0.87995659, 0.82845576))
errs = np.array((0.00029904, 0.00049384, 0.00076344, 0.00053886, 0.00066012))
params = model.make_params(a=0, b=0)
result = model.fit(data=data, params=params, SL_array=SL_array, method="Nelder", markersize=10, weights=errs)
lmfit.report_fit(result)
result.plot(yerr = errs, ax_fit_kws=ax_fit_kws, ax_res_kws=ax_res_kws)
emcee_kws = dict(steps=400, burn=30, thin=20, is_weighted=False,
progress=True)
emcee_params = result.params.copy()
emcee_params.add('__lnsigma', value=np.log(0.1), min=np.log(0.001), max=np.log(2.0))
result_emcee = model.fit(data=data, SL_array=SL_array, params=emcee_params, method='emcee',
nan_policy='omit', fit_kws=emcee_kws)
lmfit.report_fit(result_emcee)
ax = plt.plot(SL_array, model.eval(params=result.params, SL_array=SL_array), label='Nelder', zorder=100)
result_emcee.plot_fit(ax=ax, data_kws=dict(color='gray', markersize=10), yerr = errs)
emcee_corner = corner.corner(result_emcee.flatchain, labels=result_emcee.var_names,
truths=list(result_emcee.params.valuesdict().values()))
plt.show()
My problem is rather simple: I would like the initial Nelder
fitting routine to take into account the errs
array as errors on the data
array (which are my experimentally determined points). I am not sure that calling weights=errs
achieves this goal. I have tried the solution implemented here: How do I include errors for my data in the lmfit least squares miniimization, and what is this error for conf_interval2d function in lmfit?
But I could not make it work.
Another point which is not really clear to me is: does the emcee
part of my fit take into account the residuals from the Nelder
routine?
Many thanks in advance!
EDIT
After a bit more of research, I now believe that when calling weights I should actually give 1/err
. By implementing this change, and applying scale_covar=False
as described elsewhere (How to properly get the errors in lmfit) while artifically incrementing the error values (for instance, deliberate multiplication of the errs
array by 100-fold) I do get the fitted parameters errors to increase substantially, which is an expected behaviour. In short: result = model.fit(data=data, params=params, SL_array=SL_array, method="Nelder", markersize=10, weights=errs)
was changed to result = model.fit(data=data, params=params, SL_array=SL_array, method="Nelder", markersize=10, weights=1/errs)
. Is this correct?
I am still rather confused on the implementation of emcee
in my case.
What you say in your edits is correct: You want to use weights=1./err
to properly weight the residual of data
and model
by the uncertainties in the data, err
.
You probably want to use the same in your call to model.fit(..., method='emcee')
too.
I should say that the use of emcee
in lmfit
is rather confusing and gives the unfortunate impression that it is doing a fit. This is simply not true, as emcee
(and, really MCMC as a method) can not really do a fit in the sense of "systematically refine parameter values in order to find an improved solution". What it is doing is exploring parameter space in the vicinity of the input parameter values (that happen to be the solution from the Nelder
method).
This exploration may find (more like "stumble upon" than "seek") an improved solution and the results will reflect the exploration it does.