I am currently using my beginner-level knowledge of Python for some econometric problems I am facing. Until now, this has worked perfectly fine. However, my current problem is finding a graph and a function that fit a few interview answers, for example the following 6 points:
xvalues = [0, 0.2, 0.4, 0.6, 0.8, 1]
yvalues = [0, 0.15, 0.6, 0.49, 0.51, 1]
I've used curve_fit with mixed results. I have no problem with sigmoid and logarithmic functions. But when it comes to polynomial functions, I need to limit the possible y-values the function can have. For 0 <= x <= 1 the following conditions have to apply (I don't care about x < 0 and x > 1):
As a basis, let's take the following very simple code that works:
from numpy import arange
from scipy.optimize import curve_fit

def poly6(x, a, b, c, d, e, f):
    # 6th-degree polynomial without a constant term, so poly6(0) is always 0
    return f * (x ** 6) + e * (x ** 5) + d * (x ** 4) + c * (x ** 3) + b * (x ** 2) + a * (x ** 1)

xvalues = [0, 0.2, 0.4, 0.6, 0.8, 1]
yvalues = [0, 0.15, 0.6, 0.49, 0.51, 1]
x = xvalues
y = yvalues
# fine grid over the data range, for plotting the fitted curve
x_line = arange(min(x), max(x), 0.01)
# least-squares fit of the six coefficients
popt, _ = curve_fit(poly6, x, y)
a, b, c, d, e, f = popt
print("Poly 6:")
print(popt)
How can I efficiently write these conditions down?
I've tried to find an answer, but with underwhelming success. I found it hard to narrow my problem down to a one-liner that other people have already asked about.
You can use scipy.optimize.minimize to bound the possible y values of your function by adding a penalty to the objective. I only implemented the limits of y being between 0 and 1. I didn't fully understand what you meant by the maxima/minima of the function having to be in the interval 0 <= x <= 1. Or do you mean the minimum has to be at x=0 and the maximum at x=1? If that's the case, then it's fairly easy to add two new weights for those situations.
from scipy.optimize import minimize
import numpy as np
import matplotlib.pyplot as plt

xvalues = np.array([0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1])
yvalues = np.array([0, 0.075, 0.15, 0.375, 0.6, 0.545, 0.49, 0.5, 0.51, 0.755, 1])

def poly6(x, a, b, c, d, e, f):
    return f * (x ** 6) + e * (x ** 5) + d * (x ** 4) + c * (x ** 3) + b * (x ** 2) + a * (x ** 1)

def min_function(params, x, y):
    # sum of squared residuals between the data and the model
    model = poly6(x, *params)
    residual = ((y - model) ** 2).sum()
    # penalize any model value outside [0, 1] at the sampled x positions
    if np.any(model > 1):
        residual += 100  # Just some large value
    if np.any(model < 0):
        residual += 100
    return residual

res = minimize(min_function, x0=(1, 1, 1, 1, 1, 1), args=(xvalues, yvalues))

plt.plot(xvalues, yvalues, label='data')
plt.plot(xvalues, poly6(xvalues, *res.x), label='model')
plt.legend()
plt.show()
This is the resulting fit:
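Note that the penalty above only checks the bounds at the sampled x positions, and it makes the objective jump by 100 as soon as any point leaves [0, 1]. One possible alternative, given here only as a rough sketch, is to pass the bounds to minimize as inequality constraints evaluated on a dense grid, using method="SLSQP". The names sse and x_grid, the grid density of 101 points, and the starting guess y = x are my own assumptions and not part of the question. The bounds are only enforced at the grid points, so the curve could still overshoot very slightly between them.

```python
import numpy as np
from scipy.optimize import minimize

# The six points from the question
xvalues = np.array([0, 0.2, 0.4, 0.6, 0.8, 1])
yvalues = np.array([0, 0.15, 0.6, 0.49, 0.51, 1])

def poly6(x, a, b, c, d, e, f):
    return f * x**6 + e * x**5 + d * x**4 + c * x**3 + b * x**2 + a * x

def sse(params, x, y):
    # plain sum of squared errors, no penalty term
    return np.sum((y - poly6(x, *params)) ** 2)

# Dense grid on which 0 <= y <= 1 is enforced (assumed density).
# x = 0 is skipped: the model has no constant term, so it is always 0 there.
x_grid = np.linspace(0, 1, 101)[1:]

# SLSQP treats "ineq" constraints as fun(params) >= 0, element-wise
constraints = [
    {"type": "ineq", "fun": lambda p: poly6(x_grid, *p)},      # y >= 0
    {"type": "ineq", "fun": lambda p: 1 - poly6(x_grid, *p)},  # y <= 1
]

# Start from y = x, which already satisfies both constraints
x0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])

res = minimize(sse, x0, args=(xvalues, yvalues),
               method="SLSQP", constraints=constraints)

print(res.success, res.message)
print(res.x)
```

It is worth checking res.success and plotting poly6 on a fine grid before trusting the coefficients, since a high-degree polynomial through few points can be poorly conditioned.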