Search code examples
pythonscipycurve-fittingleast-squareslevenberg-marquardt

Why don't these curve fit results match?


I'm attempting to estimate a decay rate using an exponential fit, but I'm puzzled by why the two methods don't give the same result.

In the first case, taking the log of the data to linearize the problem matches Excel's exponential trendline fit. I had expected that fitting the exponential directly would be the same.

import numpy as np
from scipy.optimize import curve_fit

def exp_func(x, a, b):
    return a * np.exp(-b * x) 

def lin_func(x, m, b):
    return m*x + b

xdata = [1065.0, 1080.0, 1095.0, 1110.0, 1125.0, 1140.0, 1155.0, 1170.0, 1185.0, 1200.0, 1215.0, 1230.0, 1245.0, 1260.0, 1275.0, 1290.0, 1305.0, 1320.0, 1335.0, 1350.0, 1365.0, 1380.0, 1395.0, 1410.0, 1425.0, 1440.0, 1455.0, 1470.0, 1485.0, 1500.0]
ydata = [21.3934, 17.14985, 11.2703, 13.284, 12.28465, 12.46925, 12.6315, 12.1292, 10.32762, 8.509195, 14.5393, 12.02665, 10.9383, 11.23325, 6.03988, 9.34904, 8.08941, 6.847, 5.938535, 6.792715, 5.520765, 6.16601, 5.71889, 4.949725, 7.62808, 5.5079, 3.049625, 4.8566, 3.26551, 3.50161]

xdata = np.array(xdata)
xdata = xdata - xdata.min() + 1
ydata = np.array(ydata)
lydata = np.log(ydata)

lopt, lcov = curve_fit(lin_func, xdata, lydata)
elopt = [np.exp(lopt[1]),-lopt[0]]

eopt, ecov = curve_fit(exp_func, xdata, ydata, p0=elopt)

print 'elopt: {},{}'.format(*elopt)
print 'eopt: {},{}'.format(*eopt)

results:

elopt: 17.2526204283,0.00343624199064
eopt: 17.1516384575,0.00330590568338

Solution

  • You're solving two different optimization problems. The curve_fit() assumes that the noise eps_i is additive (and somewhat Gaussian). Else it wont deliver optimal results.

    Assuming that you want to minimize Sum (y_i - f(x_i))**2 with:

    f(x) = a * Exp(-b * x) + eps_i

    where eps_i the unknown error for the i-th data item you want to eliminate. Taking the logarithm results in

    Log(f(x)) = Log(a*Exp(-b*x) + eps_i) != Log(Exp(Log(a) - b*x)) + eps_i

    You can interpret the exponential equation as having additive noise. Your linear version has multiplicative noise mu_i, because:

    g(x) = a * mu_i * Exp(-b*x)

    results in

    Log(g(x) = Log(a) - b * x + Log(mu_i)

    In conclusion, you will only get identical results when the magnitude of the errors eps_i is very small.