Search code examples
pythonscipyinterpolationbspline

scipy splrep() with weights not fitting the given curve


Using scipy's splrep I can easily fit a test sinewave:

import numpy as np
from scipy.interpolate import splrep, splev
import matplotlib.pyplot as plt
plt.style.use("ggplot")

# Generate test sinewave
x = np.arange(0, 20, .1)
y = np.sin(x)

# Interpolate
tck = splrep(x, y)
x_spl = x + 0.05 # Just to show it wors
y_spl = splev(x_spl, tck)
plt.plot(x_spl, y_spl)

Spline sinewave plot

The splrep documentation states that the default value for the weight parameter is np.ones(len(x)). However, plotting this results in a totally different plot:

tck = splrep(x, y, w=np.ones(len(x_spl)))
y_spl = splev(x_spl, tck)
plt.plot(x_spl, y_spl)

Plot with splev and weights

The documentation also states that the smoothing condition s is different when a weight array is given - but even when setting s=len(x_spl) - np.sqrt(2*len(x_spl)) (the default value without a weight array) the result does not strictly correspond to the original curve as shown in the plot.

What do I need to change in the code listed above in order to make the interpolation with weight array (as listed above) output the same result as the interpolation without the weights? I have tested this with scipy 0.17.0. Gist with a test IPython notebook


Solution

  • You only have to change one line of your code to get the identical output:

    tck = splrep(x, y, w=np.ones(len(x_spl)))
    

    should become

    tck = splrep(x, y, w=np.ones(len(x_spl)), s=0)
    

    So, the only difference is that you have to specify s instead of using the default one.

    When you look at the source code of splrep you will see why that is necessary:

    if w is None:
        w = ones(m, float)
        if s is None:
            s = 0.0
    
    else:
        w = atleast_1d(w)
        if s is None:
            s = m - sqrt(2*m)
    

    which means that, if neither weights nor s are provided, s is set to 0 and if you provide weights but no s then s = m - sqrt(2*m) where m = len(x).

    So, in your example above you compare outputs with the same weights but with different s (which are 0 and m - sqrt(2*m), respectively).