So I was reading the documentation about curve_fit here. It contains the following example:
import numpy as np
import scipy.optimize as so

def func(x, a, b, c):
    return a * np.exp(-b * x) + c

# true parameters and synthetic noisy data
a, b, c = 2.5, 1.3, 0.5
nx = 500
noiseAlpha = 0.5

xdata = np.linspace(0, 4, nx)
y = func(xdata, a, b, c)
ydata = y + noiseAlpha * np.random.normal(size=len(xdata))
If I call curve_fit now, it will approximate the derivatives numerically, since I didn't provide anything. Let's time it:
%%timeit
popt, pcov = so.curve_fit(func, xdata, ydata)
1000 loops, best of 3: 787 µs per loop
In fact curve_fit calls leastsq (doc here), which accepts a Dfun argument for computing the Jacobian. So I did this:
def myDfun(abc, xdata, ydata, f):
    a, b, c = abc
    ebx = np.exp(-b * xdata)
    # columns: d/da = exp(-b*x), d/db = -a*x*exp(-b*x), d/dc = 1
    res = np.vstack((ebx, a * -xdata * ebx, np.ones(len(xdata)))).T
    return res
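(As a quick sanity check, not in the original question, one can compare this analytic Jacobian against central finite differences; the test point p0 and step eps below are arbitrary choices of mine:)

p0 = np.array([2.0, 1.0, 0.3])   # arbitrary test point
eps = 1e-6
J_num = np.empty((len(xdata), 3))
for i in range(3):
    dp = np.zeros(3)
    dp[i] = eps
    # central difference for column i of the Jacobian
    J_num[:, i] = (func(xdata, *(p0 + dp)) - func(xdata, *(p0 - dp))) / (2 * eps)
print(np.allclose(myDfun(p0, xdata, ydata, func), J_num, atol=1e-5))  # expect True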
And I timed it again:
%%timeit
popt, pcov = so.curve_fit(func, xdata, ydata, Dfun=myDfun)
1000 loops, best of 3: 858 µs per loop
Huh? Using the Jacobian is slower than approximating it? Did I do something wrong?
Not really an answer, but my feeling is that it depends on the size of the problem. For a small problem (n=500), the extra time spent evaluating the Jacobian (with the supplied jac) probably doesn't pay off in the end: the analytic Jacobian saves the function evaluations that finite differencing needs, but each call into a Python Dfun has its own overhead.
n=500, with jac:
100 loops, best of 3: 1.50 ms per loop
Without:
100 loops, best of 3: 1.57 ms per loop
n=5000, with jac:
100 loops, best of 3: 5.07 ms per loop
Without:
100 loops, best of 3: 6.46 ms per loop
n=50000, with jac:
100 loops, best of 3: 49.1 ms per loop
Without:
100 loops, best of 3: 59.2 ms per loop
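For reference, here is a minimal sketch of how that scaling experiment can be reproduced with plain timeit instead of the %%timeit magic, assuming the definitions above are in scope and a SciPy version where curve_fit forwards Dfun to leastsq, as in the question (the output format is mine):

import timeit

for n in (500, 5000, 50000):
    xd = np.linspace(0, 4, n)
    yd = func(xd, a, b, c) + noiseAlpha * np.random.normal(size=n)
    t_jac = timeit.timeit(lambda: so.curve_fit(func, xd, yd, Dfun=myDfun), number=100)
    t_num = timeit.timeit(lambda: so.curve_fit(func, xd, yd), number=100)
    # timeit returns total seconds for 100 runs; convert to ms per loop
    print('n=%d: with jac %.2f ms, without %.2f ms' % (n, 10 * t_jac, 10 * t_num))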
Also you may want to consider rewriting the Jacobian function; for example, getting rid of the expensive transpose (.T) can bring about a ~15% speed-up:
def myDfun2(abc, xdata, ydata, f):
    a, b, c = abc
    ebx = np.exp(-b * xdata)
    # same three columns as myDfun, but assembled as (n, 1) blocks
    # so no transpose is needed
    res = np.hstack((ebx.reshape(-1, 1),
                     (a * -xdata * ebx).reshape(-1, 1),
                     np.ones((len(xdata), 1))))
    return res
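(A minimal usage sketch of my own: both Jacobian variants should drive curve_fit to essentially the same fit, so the rewrite is purely a performance change:)

popt1, _ = so.curve_fit(func, xdata, ydata, Dfun=myDfun)
popt2, _ = so.curve_fit(func, xdata, ydata, Dfun=myDfun2)
print(popt1)  # roughly [2.5, 1.3, 0.5], up to the noise
print(popt2)  # should agree with popt1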