
How to select good knot sequences for "scipy.interpolate.make_lsq_spline"


I want to fit a smoothing B-spline to a sequence of 2D data points using scipy.interpolate.make_lsq_spline.

x = [0., 0.37427465, 0.68290943, 0.83261929, 1. ]
y = [-1.0, 3.0, 4.0, 2.0, 1.0] 

But I don't know how to select a proper knot vector t, and the error message does not make sense to me.

In [1]: import numpy as np

In [2]: from scipy.interpolate import make_lsq_spline

In [3]: x = [0., 0.37427465, 0.68290943, 0.83261929, 1. ]

In [4]: y = [-1.0, 3.0, 4.0, 2.0, 1.0]

In [5]: t = [0.,0.,0.,0.,0.25,0.5,0.75,1.,1.,1.,1 ]

In [6]: spl = make_lsq_spline(x, y, t)
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-6-4440a73d26f0> in <cell line: 1>()
----> 1 spl = make_lsq_spline(x, y, t)

/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/scipy/interpolate/_bsplines.py in make_lsq_spline(x, y, t, k, w, axis, check_finite)
   1513
   1514     # have observation matrix & rhs, can solve the LSQ problem
-> 1515     cho_decomp = cholesky_banded(ab, overwrite_ab=True, lower=lower,
   1516                                  check_finite=check_finite)
   1517     c = cho_solve_banded((cho_decomp, lower), rhs, overwrite_b=True,

/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/scipy/linalg/_decomp_cholesky.py in cholesky_banded(ab, overwrite_ab, lower, check_finite)
    280     c, info = pbtrf(ab, lower=lower, overwrite_ab=overwrite_ab)
    281     if info > 0:
--> 282         raise LinAlgError("%d-th leading minor not positive definite" % info)
    283     if info < 0:
    284         raise ValueError('illegal value in %d-th argument of internal pbtrf'

LinAlgError: 5-th leading minor not positive definite

Is there any guideline for selecting a proper knot sequence t?


Solution

  • I had a similar problem, and from your example I think I can tell what is going wrong. From a linear algebra point of view, you are asking for the solution of a problem that cannot be solved uniquely. You provide 11 knots t, which means there are 11-3-1 = 7 coefficients to be determined, since you fit with a spline of degree k=3 (the default of make_lsq_spline). Evaluated at the 5 points x, the left-hand side of your equation system is a 5 x 7 matrix D. D has full rank in the example (rank 5), but this does not help: the 7 x 7 matrix N = D.T@D then also has rank 5, so it is only positive semidefinite and two of its eigenvalues are 0. It cannot be inverted, and the Cholesky factorization that make_lsq_spline applies (see cholesky_banded in your traceback) fails, which is the LinAlgError you see. One solution is to get rid of 2 knots, say the knots at 0.25 and 0.75, so that 5 coefficients remain for 5 data points. Keep the four-fold knots at the boundaries when working with splines of degree 3, because you most likely want your spline to be allowed to jump there instead of being tied to anything outside [0, 1]. To sum up, the knots have to be chosen in such a way that the least-squares problem is uniquely solvable. I also added some code illustrating what I am trying to say. Hope that helps.

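    import numpy as np
    import scipy.interpolate as sciint
    import matplotlib.pyplot as plt

    x = [0., 0.37427465, 0.68290943, 0.83261929, 1.]
    y = [-1.0, 3.0, 4.0, 2.0, 1.0]
    t = [0., 0., 0., 0., 0.25, 0.5, 0.75, 1., 1., 1., 1.]
    k = 3
    n = len(t) - k - 1   # 7 coefficients

    # One B-spline basis function per coefficient (unit coefficient vector).
    splines = []
    for i in range(n):
        coeff = np.zeros(n)
        coeff[i] = 1.
        splines.append(sciint.BSpline(t, coeff, k))

    # Plot the 7 basis functions in the first 7 panels of a 3 x 3 grid.
    fig, ax = plt.subplots(3, 3)
    dom = np.linspace(0., 1., 1000)
    for axes, spline in zip(ax.flat, splines):
        axes.plot(dom, spline(dom))

    # Design matrix D (5 x 7): column i holds basis function i evaluated at x.
    D = np.column_stack([spline(x) for spline in splines])
    N = D.T @ D

    # Eigenvalues of N; two of them are zero up to rounding error.
    eigvals = np.linalg.eigvalsh(N)
    print(eigvals)

    # Dropping the knots at 0.25 and 0.75 leaves 5 coefficients for 5 data points.
    t2 = [0., 0., 0., 0., 0.5, 1., 1., 1., 1.]
    bspline = sciint.make_lsq_spline(x, y, t2)

    # Last two panels: the data (red) and the fitted spline (magenta).
    ax[2, 1].plot(x, y, 'r')
    ax[2, 2].plot(dom, bspline(dom), 'm')
    plt.show()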
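
    As a rough guideline following from the argument above: a knot vector t gives len(t) - k - 1 coefficients, and the fit can only be unique if the design matrix has that many linearly independent columns, which in particular requires at least as many data points as coefficients. The sketch below checks this before calling make_lsq_spline. The helper names clamped_knots, design_matrix and is_solvable are made up for this illustration and are not part of SciPy, and placing the interior knots at quantiles of x is just one common heuristic that I assume here, not a rule taken from the SciPy documentation.

```python
import numpy as np
from scipy.interpolate import BSpline, make_lsq_spline


def clamped_knots(x, n_interior, k=3):
    """Hypothetical helper: (k+1)-fold boundary knots plus interior knots
    placed at evenly spaced quantiles of the data sites x."""
    x = np.asarray(x, dtype=float)
    q = np.linspace(0.0, 1.0, n_interior + 2)[1:-1]  # drop the 0% and 100% quantiles
    interior = np.quantile(x, q)
    return np.concatenate([np.full(k + 1, x.min()), interior, np.full(k + 1, x.max())])


def design_matrix(x, t, k=3):
    """Hypothetical helper: matrix D with D[i, j] = B_j(x[i])."""
    n = len(t) - k - 1
    # An identity coefficient array evaluates all n basis functions at once.
    return BSpline(np.asarray(t, dtype=float), np.eye(n), k)(np.asarray(x, dtype=float))


def is_solvable(x, t, k=3):
    """True if D has full column rank, i.e. D.T @ D is positive definite."""
    D = design_matrix(x, t, k)
    return np.linalg.matrix_rank(D) == D.shape[1]


x = [0., 0.37427465, 0.68290943, 0.83261929, 1.]
y = [-1.0, 3.0, 4.0, 2.0, 1.0]

# Knot vector from the question: 7 coefficients for only 5 data points.
t_bad = [0., 0., 0., 0., 0.25, 0.5, 0.75, 1., 1., 1., 1.]
print(is_solvable(x, t_bad))

# With 5 points and k = 3 there is room for at most 5 - 3 - 1 = 1 interior knot.
t_good = clamped_knots(x, n_interior=len(x) - 3 - 1)
print(is_solvable(x, t_good))

if is_solvable(x, t_good):
    spl = make_lsq_spline(x, y, t_good)
    print(spl(x))
```

    With more data points than coefficients the same check applies; the spline then smooths the data instead of passing through every point.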