I’m trying to plot a curve_fit for the S&P 500.
I’m successful (I think) at performing a linear fit/plot. When I try to get an exponential curve_fit to work, I get this error:
Optimal parameters not found: Number of calls to function has reached maxfev = 800.
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
from scipy.optimize import curve_fit
# get data
df = yf.download("SPY", interval = '1mo')
df = df.reset_index()
def func(x, a, b):
    return a * x + b
    # ?? return a * np.exp(-b * x) + c
    # ?? return a*(x**b)+c
    # ?? return a*np.exp(b*x)
# create data arrays
# convert Date to numeric for curve_fit ??
xdata = df['Date'].to_numpy().astype(np.int64)//10**9
ydata = df['Close'].to_numpy()
# p0 = (?, ?, ?) use guesses?
popt, pcov = curve_fit(func, xdata, ydata)
print(popt)
y_pred = func(xdata, *popt)
plt.plot(xdata, ydata)
plt.plot(xdata, y_pred, '-')
plt.show()
Am I dealing with dates correctly?
Should I be doing a p0 initial guess?
This question/solution may provide some clues.
It would be nice to have the x-axis labeled in a date format (but not important right now).
In addition to normalizing the data, it is important to actually choose a good function. In your example you had:
def func(x, a, b):
    return a * x + b
    # ?? return a * np.exp(-b * x) + c
    # ?? return a*(x**b)+c
    # ?? return a*np.exp(b*x)
The correct one, when you say you want to fit an exponential, should be this, IMO:
# define the exponential growth function; the -b in your version makes it a decay for positive b
def ExponentialGrowth(x, a, b, c):
    return a * np.exp(b * x) + c  # + c accounts for a vertical offset
The power function might work as well; I did not check. Anyway, here's the code:
# define the exponential growth function; the -b in your version makes it a decay for positive b
def ExponentialGrowth(x, a, b, c):
    return a * np.exp(b * x) + c  # + c accounts for a vertical offset
# get data
x = df['Date'].to_numpy().astype(np.int64)//10**9
y = df['Close'].to_numpy()
# apply z normalization
xNorm = (x - x.mean()) / x.std()
yNorm = (y - y.mean()) / y.std()
# get the optimal parameters
popt, pcov = curve_fit(ExponentialGrowth, xNorm, yNorm)
# get the predicted values in the normalized range
yPredNorm = ExponentialGrowth(xNorm, *popt)
# reverse normalize the predicted values
yPred = yPredNorm * (y.std()) + y.mean()
plt.figure()
plt.scatter(df['Date'], y, s=1)  # raw monthly closes as small dots
plt.plot(df['Date'], yPred, 'r-')  # fitted curve in red
plt.grid()
plt.legend(["Raw", "Fitted"])
plt.xlabel("Year")
plt.ylabel("Close")
plt.show()
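As a side note on why the z-normalization step in the code above matters: when no p0 is given, curve_fit starts every parameter at 1, so an exponential model is first evaluated as exp(1 * x). With x in epoch seconds that overflows immediately. The sketch below illustrates this with a made-up x array (the linspace range is an assumption standing in for the real timestamps, not the actual SPY dates):

```python
import numpy as np

# Hypothetical stand-in for the epoch-second timestamps (roughly 1993 to 2023)
x = np.linspace(7.3e8, 1.7e9, 360)

# First evaluation with the default guess b = 1 on the raw x values
with np.errstate(over="ignore"):
    raw_first_eval = np.exp(1.0 * x)
print(np.isinf(raw_first_eval).all())  # every value overflowed to inf

# Same evaluation after z-normalization: x_norm is centered with unit spread
x_norm = (x - x.mean()) / x.std()
norm_first_eval = np.exp(1.0 * x_norm)
print(np.isfinite(norm_first_eval).all())  # all values are finite
print(x_norm.min(), x_norm.max())
```

With every residual infinite from the first step, the optimizer has nothing useful to work with, which is presumably what leads to the maxfev error in the question.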
And the results of this fit on the SPY data: [figure: raw monthly closes with the fitted exponential curve]
If you eventually need initial guesses, you can search online for how to estimate them for a given function. For example, if I am fitting an exponential growth function and I know that the data has an offset of 100, I can set the initial guess of c to 100 through the p0 argument of curve_fit.
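Here is a minimal sketch of passing such a guess, assuming synthetic data rather than the SPY prices. The true parameters (2, 1, 100), the noise level, and the lowercase name exponential_growth are all made up for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def exponential_growth(x, a, b, c):
    # same model as ExponentialGrowth above
    return a * np.exp(b * x) + c

# Synthetic data with a known offset of 100 (assumed values, not market data)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 200)
y = exponential_growth(x, 2.0, 1.0, 100.0) + rng.normal(0.0, 0.5, x.size)

# Rough guesses for a and b, and the known offset as the guess for c
p0 = (1.0, 0.8, 100.0)
popt, pcov = curve_fit(exponential_growth, x, y, p0=p0, maxfev=5000)
print(popt)
```

The order of p0 follows the order of the parameters after x in the function signature, so the third entry is the guess for c.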
Hope this helps you.