Search code examples
pythonscipycurve-fittingpiecewise

Curvefitting optimization error when fitting piecewise linear function


I have some data in two arrays, which appears to have a break in it. I want my code to figure out where the break is with using piecewise in scipy. Here is what I have:

from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

x = np.array([7228,7620,7730,7901,8139,8370,8448,8737,8824,9089,9233,9321,9509,9568,9642,9756,9915,10601,10942], dtype=np.float)
y= np.array([.874,.893,.8905,.8916,.9095,.9142,.9109,.9185,.9169,.9251,.9290,.9304,.9467,.9378,0.9464,0.9508,0.9583,0.9857,0.9975],dtype=np.float)

def piecewise_linear(x, x0, y0, k1, k2):
    return np.piecewise(x, [x < x0], [lambda x:k1*x + y0-k1*x0, lambda x:k2*x + y0-k2*x0])

p , e = optimize.curve_fit(piecewise_linear, x, y)
perr = np.sqrt(np.diag(e))
xd = np.linspace(7228, 11000, 3000)
plt.plot(x, y, "o")
plt.plot(xd, piecewise_linear(xd, *p))

My issue is if I run this, I get an error, "OptimizeWarning: Covariance of the parameters could not be estimated category=OptimizeWarning)". Not sure how to get around this? Is there maybe a way to feed initial parameters into this function to help it converge or similar?

Note, I do realize that the other way I could be getting this to work is interpolating and finding the second derivative of my data. I've already done this, but because my data is not evenly spaced/ the y axis data has some error in it I am interested in getting it to work this way as well for statistical purposes. So, to be clear, what I want here are the parameters for the two lines (slope/intercept), and the inflection point. (Ideally I would love to get an error too on these too, but not sure if that's possible with this method.) Thanks in advance!


Solution

  • The code works perfectly fine, only the initial values are causing problems.

    By default curve_fit starts with all parameters set to 1. Thus, x0 starts way out of range of the x in your data and the optimizer cannot compute a sensible gradient. This small modification will fix the issue:

    # make sure initial x0 and y0 are in range of the data
    p0 = [np.mean(x), np.mean(y), 1, 1]
    
    p , e = optimize.curve_fit(piecewise_linear, x, y, p0)  # set initial parameter estimates
    perr = np.sqrt(np.diag(e))
    xd = np.linspace(7228, 11000, 3000)
    plt.plot(x, y, "o")
    plt.plot(xd, piecewise_linear(xd, *p))
    
    print(p)  # [  9.32099947e+03   9.32965835e-01   2.58225121e-05   4.05400820e-05]
    print(np.diag(e))  # [  4.56978067e+04   5.52060368e-05   3.88418404e-12   7.05010755e-12]
    

    enter image description here