Tags: python, curve-fitting, scipy-optimize

How do I apply limits for y-values with curve_fit?


I am currently using my beginner-level knowledge of Python for some econometric problems I am facing. Until now, this has worked perfectly fine. However, my current problem is finding a graph and a function for a few interview answers, for example for the following 6 points:

xvalues = [0, 0.2, 0.4, 0.6, 0.8, 1]
yvalues = [0, 0.15, 0.6, 0.49, 0.51, 1]

I've used curve_fit with mixed results. I have no problem with sigmoid and logarithmic functions. But when it comes to polynomial functions, I need to limit the possible y-values the function can have. For 0 <= x <= 1 the following conditions have to apply (I don't care about x < 0 and x > 1):

  • 0 <= y <= 1
  • Maxima and minima of the function have to be located at said points. This doesn't apply to inflection points, though. Edit for clarity: Maxima and minima have to be located only at said points.
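Spelled out in code, the two conditions could be checked for a candidate set of coefficients roughly as follows. This is only a sketch: the helper name `check_conditions`, the grid size, and the tolerances are assumptions made for illustration, and the range check only samples the interval rather than proving the bound everywhere.

```python
import numpy as np
from numpy.polynomial import Polynomial

def check_conditions(coeffs, x_points, tol=1e-6, eps=1e-4):
    """Check a polynomial a*x + b*x**2 + ... + f*x**6 on 0 <= x <= 1.

    coeffs is (a, b, c, d, e, f), in the same order as poly6 uses them.
    Returns (in_range, extrema, extrema_at_points).
    """
    # Polynomial takes coefficients in ascending order; the constant term is 0
    p = Polynomial([0.0, *coeffs])
    dp = p.deriv()

    # Condition 1: 0 <= y <= 1, sampled on a dense grid (an approximation)
    y = p(np.linspace(0, 1, 1001))
    in_range = bool(y.min() >= -tol and y.max() <= 1 + tol)

    # Stationary points: real roots of the derivative strictly inside (0, 1)
    roots = dp.roots()
    real = roots[np.abs(roots.imag) < tol].real
    interior = real[(real > 0) & (real < 1)]

    # Keep only true extrema: the derivative changes sign there.
    # Inflection points (no sign change) are ignored, as required.
    extrema = interior[dp(interior - eps) * dp(interior + eps) < 0]

    # Condition 2: every interior extremum sits on one of the given x-values
    x_points = np.asarray(x_points, dtype=float)
    at_points = all(np.min(np.abs(x_points - r)) < 1e-3 for r in extrema)
    return in_range, extrema, at_points

xvalues = [0, 0.2, 0.4, 0.6, 0.8, 1]
# y = 4x - 4x**2 stays within [0, 1] but peaks at x = 0.5, which is not a data point
print(check_conditions((4, -4, 0, 0, 0, 0), xvalues))
```

A check like this does not perform the fit itself; it only tells whether a given result from `curve_fit` satisfies both conditions.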

As a basis, let's take the following very simple code, which works:

import numpy as np
from scipy.optimize import curve_fit

def poly6(x, a, b, c, d, e, f):
    return f * (x ** 6) + e * (x ** 5) + d * (x ** 4) + c * (x ** 3) + b * (x ** 2) + a * (x ** 1)

xvalues = [0, 0.2, 0.4, 0.6, 0.8, 1]
yvalues = [0, 0.15, 0.6, 0.49, 0.51, 1]

x = np.array(xvalues)
y = np.array(yvalues)

# Dense grid over the x-range for evaluating the fitted curve
x_line = np.linspace(min(x), max(x), 101)
popt, _ = curve_fit(poly6, x, y)
a, b, c, d, e, f = popt
print("Poly 6:")
print(popt)

How can I efficiently write these conditions down?

I've tried to find an answer, but with underwhelming success. I found it hard to narrow my problem down to a one-liner that other people have already asked.


Solution

  • You can use scipy.optimize.minimize with a custom objective that penalizes y-values outside the allowed range. I only implemented the limit of y being between 0 and 1, and the penalty is only checked at the x-values passed in, not between them. I didn't fully understand what you meant by the maxima/minima of the function having to be in the interval 0 <= x <= 1. Or do you mean that the minimum has to be at x=0 and the maximum at x=1? If that's the case, then it's fairly easy to add two new penalty terms for those situations.

    from scipy.optimize import minimize
    import numpy as np
    import matplotlib.pyplot as plt
    
    # The six original points plus linearly interpolated midpoints
    xvalues = np.array([0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1])
    yvalues = np.array([0, 0.075, 0.15, 0.375, 0.6, 0.545, 0.49, 0.5, 0.51, 0.755, 1])
    
    def poly6(x, a, b, c, d, e, f):
        return f * (x ** 6) + e * (x ** 5) + d * (x ** 4) + c * (x ** 3) + b * (x ** 2) + a * (x ** 1)
    
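    def min_function(params, x, y):
        # Sum of squared residuals between the data and the model
        model = poly6(x, *params)
        residual = ((y - model) ** 2).sum()
        
        # Add a fixed penalty whenever the model leaves [0, 1] at any sampled x
        if np.any(model > 1):
            residual += 100  # Just some large value
        if np.any(model < 0):
            residual += 100
    
        return residual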
    
    res = minimize(min_function, x0=(1, 1, 1, 1, 1, 1), args=(xvalues, yvalues))
    
    plt.plot(xvalues, yvalues, label='data')
    plt.plot(xvalues, poly6(xvalues, *res.x), label='model')
    plt.legend()
    plt.show()
    

    This is the resulting fit:

    [Image: plot of the data together with the fitted model]
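A hard-constraint variant of the same idea, as a rough sketch rather than a definitive solution: instead of adding a penalty to the residual, the bounds 0 <= y <= 1 can be passed to `minimize` as inequality constraints evaluated on a dense grid. The grid size (201 points), the choice of the `SLSQP` method, the starting guess, and the helper name `sse` are assumptions made for illustration. The constraints hold only at the grid points, and the condition on maxima and minima is not addressed here.

```python
import numpy as np
from scipy.optimize import minimize

# The six points from the question
xvalues = np.array([0, 0.2, 0.4, 0.6, 0.8, 1])
yvalues = np.array([0, 0.15, 0.6, 0.49, 0.51, 1])

def poly6(x, a, b, c, d, e, f):
    return f * x**6 + e * x**5 + d * x**4 + c * x**3 + b * x**2 + a * x

def sse(params):
    # Plain sum of squared errors at the data points (no penalty term)
    return np.sum((yvalues - poly6(xvalues, *params)) ** 2)

# Dense grid on which the bounds are enforced (assumed resolution)
x_grid = np.linspace(0, 1, 201)

# SLSQP treats "ineq" constraints as fun(params) >= 0, element-wise
constraints = [
    {"type": "ineq", "fun": lambda p: poly6(x_grid, *p)},      # y >= 0
    {"type": "ineq", "fun": lambda p: 1 - poly6(x_grid, *p)},  # y <= 1
]

# Start from y = x, which already satisfies both bounds on [0, 1]
res = minimize(sse, x0=(1, 0, 0, 0, 0, 0), method="SLSQP",
               constraints=constraints)

y_grid = poly6(x_grid, *res.x)
print("converged:", res.success)
print("min/max of the fit on the grid:", y_grid.min(), y_grid.max())
```

Compared with the fixed penalty of 100, constraints give the optimizer a smooth problem to work on, but it is still worth checking `res.success` and plotting the result before trusting the fit.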