I have some noisy time series that represent input parameters as a function of time for a set of ODEs. These noisy time series need to be interpolated and the derivatives of that interpolation function need to be continuous up to the same number of orders as my ODE solver. Otherwise an Nth order ODE solver will try to take extremely small adaptive time steps to deal with the discontinuous jumps in the Nth order derivative. So for example, for the standard RK45 solver in scipy.integrate.solve_ivp, all derivatives of the interpolation function up to at least 5th order must be continuous, I think.
What is the best way to create a smoothing spline in Python such that its first 5 derivatives will be continuous? I am having a strangely difficult time accomplishing this simple task. Below is a minimal working example showing that even when using a 5th order smoothing UnivariateSpline from scipy, the 4th order derivative shows discontinuities/oscillations and the 5th order derivative just explodes (~1e6 magnitude). This is despite me first gaussian-smoothing my noisy time series to help the interpolation function.
I suspect this is at least one reason why the scipy RK45 solver is taking such a long time to solve my system of ODEs -- its taking an unnecessarily large number of smaller timesteps (though I think the fact that I'm using the default solve_ivp tolerances could also be playing a role).
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import lognorm
from scipy.ndimage import gaussian_filter1d
from scipy.interpolate import UnivariateSpline
x = np.sort(np.random.uniform(0.1,10,500))
dist = lognorm(0.5,loc=1.0)
y_noisy = dist.pdf(x) + np.random.uniform(0.0,1,500)
y_smooth = gaussian_filter1d(y_noisy,10,mode='nearest')
spl = UnivariateSpline(x,y_smooth,k=5,s=2)
xx = np.linspace(0.1,10,100000)
yy = spl(xx)
d1 = np.diff(yy) / np.diff(xx)
d2 = np.diff(d1) / np.diff(xx[1:])
d3 = np.diff(d2) / np.diff(xx[1:-1])
d4 = np.diff(d3) / np.diff(xx[1:-2])
d5 = np.diff(d4) / np.diff(xx[1:-3])
fig, axes = plt.subplots(nrows=6,ncols=1,figsize=(8,8))
fig.subplots_adjust(hspace=0.7)
axes[0].plot(x,y_noisy,'k-',lw=2,alpha=0.1)
axes[0].plot(x,y_smooth,'y-',lw=2)
axes[0].plot(xx, yy)
axes[0].set_title('5th-order smoothing UnivariateSpline')
axes[1].plot(xx[1:], d1)
axes[1].set_title('first derivative')
axes[2].plot(xx[1:-1], d2)
axes[2].set_title('second derivative')
axes[3].plot(xx[2:-1], d3)
axes[3].set_title('third derivative')
axes[4].plot(xx[3:-1], d4)
axes[4].set_title('fourth derivative')
axes[5].plot(xx[4:-1], d5)
axes[5].set_title('fifth derivative')
I don't think this is a valid way to check whether a function is continuous at the fifth derivative. I think the timestep is so small that the plots of the fourth and fifth derivatives are getting overwhelmed by floating point errors. The fourth and fifth differences are so tiny that small errors have a big effect.
To test this, I replaced the spline that you fit with np.sin()
, which I know has continuous derivatives.
Sure enough, it says the fourth derivative is jumping around a whole lot. But that can't be - the fourth derivative of sine is just sine.
If I decrease the number of timesteps to 1000, it correctly finds the fourth and fifth derivative. Applying the same technique to the original function, I get this: