python arrays scipy spline

Scipy spline interpolation: Determine array length of vector of knots / B-spline coefficients in tck before actual computation


Is it somehow possible to determine the array length of the arrays in the tck tuple returned by scipy.interpolate.splprep before computing the values?

I have to fit a smoothing spline to noisy data with up to 5 million data points (the number varies). My observation is that the fit is quite good at an array length of about 90. For larger array lengths the computation takes a long time and the fit becomes noisy (the length sometimes jumps directly from about 90 to about 1000 when s is made one step smaller), while for much smaller array lengths (< 50) the fit is not accurate enough.

This array length depends on the smoothing factor s passed to splprep, but the value of s needed to reach a consistent array length of about 90 varies a lot between measurement data sets. For example, for data1 s must be around 1000 for len(cfk[0]) to equal 90, whereas for data2 s must be around 100, even though data1 and data2 have the same length. It might depend on the noise in the data.

I have thought about a loop in which s starts at some value and is decreased step by step while len(cfk[0]) is checked on every iteration, but this takes ages, especially as len(cfk[0]) gets closer to 90.

Therefore, it would be useful to know in advance which smoothing factor yields the desired array length, before computing the cfk tuple.
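For illustration, the loop described above can be made cheaper by bisecting on s instead of stepping it down linearly. This is only a sketch under stated assumptions: the helper name find_s_for_knot_count, the bracket [s_lo, s_hi] and the tolerance tol are all hypothetical, and the search assumes that the number of knots does not increase as s grows, which usually holds but is not guaranteed. Because the knot count can jump (e.g. from about 90 to about 1000), the target may not be reachable exactly, so the helper returns the closest result it saw.

```python
import numpy as np
from scipy.interpolate import splprep


def find_s_for_knot_count(points, target, s_lo, s_hi, k=3, tol=5, max_iter=30):
    """Bisect on s (in log space) until len(tck[0]) is within tol of target.

    Hypothetical helper. Assumes a larger s gives the same or fewer knots.
    Returns (s, n_knots) for the closest knot count encountered.
    """
    best = None
    for _ in range(max_iter):
        # Geometric midpoint, since useful s values span orders of magnitude.
        s_mid = float(np.sqrt(s_lo * s_hi))
        tck, _ = splprep(points, s=s_mid, k=k)
        n_knots = len(tck[0])
        if best is None or abs(n_knots - target) < abs(best[1] - target):
            best = (s_mid, n_knots)
        if abs(n_knots - target) <= tol:
            break
        if n_knots > target:
            s_lo = s_mid  # too many knots: smooth more
        else:
            s_hi = s_mid  # too few knots: smooth less
    return best
```

Each iteration still runs a full fit, so this reduces the number of fits to roughly the logarithm of the bracket width rather than avoiding them.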


Solution

  • Short answer: no, not easily. The Dierckx Fortran library (FITPACK), which splrep wraps, uses some fairly non-trivial logic to determine the knot vector, and it is all baked into the Fortran code. So the only way is to carefully trace that Fortran code. It's available from netlib, and also in scipy/interpolate/fitpack
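If a fixed array length matters more than FITPACK's adaptive knot placement, one possible workaround (not part of the answer above) is to skip the smoothing factor entirely and pass the interior knots yourself. With explicit knots, splrep computes a least-squares spline (task=-1) and adds k + 1 boundary knots at each end, so the length of tck[0] is known beforehand. The sketch below assumes 1-D data y(x) with increasing x and that uniformly spaced knots are acceptable; the helper name fit_with_fixed_knot_count is hypothetical. splprep also accepts t together with task=-1 for the parametric case.

```python
import numpy as np
from scipy.interpolate import splrep


def fit_with_fixed_knot_count(x, y, n_knots, k=3):
    """Least-squares spline whose full knot vector has exactly n_knots entries.

    Hypothetical helper. Assumes x is increasing and that uniform interior
    knots are good enough for the data at hand.
    """
    # splrep adds k + 1 boundary knots at each end of the interior knots.
    n_interior = n_knots - 2 * (k + 1)
    if n_interior < 1:
        raise ValueError("n_knots must be larger than 2 * (k + 1)")
    # Uniform interior knots strictly inside (x[0], x[-1]).
    interior = np.linspace(x[0], x[-1], n_interior + 2)[1:-1]
    # task=-1: fit on the given knots; the smoothing factor s is not used.
    return splrep(x, y, k=k, t=interior, task=-1)
```

With the default k=3, asking for n_knots=90 means 82 interior knots. The trade-off is that the knots no longer adapt to the data, so regions with fine structure may need a non-uniform knot layout.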