I have a dataframe like this:
import pandas as pd
lstvals = [30.81,27.16,82.15,31.00,9.13,11.77,25.58,7.57,7.98,7.98]
lstdates = ['2021-01-01', '2021-01-05', '2021-01-09', '2021-01-13', '2021-01-17', '2021-01-21', '2021-01-25', '2021-01-29', '2021-02-02', '2021-02-06']
data = {
"Dates": lstdates,
"Market Value": lstvals
}
df = pd.DataFrame(data)
df.set_index('Dates', inplace = True)
df
I want to forecast values beyond this sample, for example from '2021-02-10' to '2022-04-23'. (My real dataset runs from '2021-01-01' to '2023-11-09', and I want to forecast the following year, from '2024-01-01' to '2024-11-09'.)
https://www.statsmodels.org/devel/examples/notebooks/generated/statespace_forecasting.html
I have defined and fitted my model as follows, which predicts the test data:
train = df['Market Value'].iloc[:1187]
test = df['Market Value'].iloc[-200:]
...
ARMAmodel = SARIMAX(y, order = (2,1,2))
ARMAResults = ARMAmodel.fit()
...
y_pred = ARMAResults.get_forecast(len(test.index))
y_pred_df = y_pred.conf_int(alpha = 0.05)
y_pred_df["Predictions"] = ARMAResults.predict(start = y_pred_df.index[0], end = y_pred_df.index[-1])
y_pred_df.index = test.index
y_pred_out = y_pred_df["Predictions"]
...
plt.plot(train, color = "black")
plt.plot(test, color = "red")
plt.ylabel('Market Value ($M)')
plt.xlabel('Date')
plt.xticks(rotation=45)
plt.title("Train/Test/Prediction for Market Data")
plt.plot(y_pred_out, color='green', label = 'Predictions')
plt.legend()
plt.show()
How can I make predictions for future dates?
I tried passing future dates to the forecast method, but it raises an error:
ARMAResults.forecast(start = '2024-01-01', end = '2024-11-09')
TypeError: statsmodels.tsa.statespace.mlemodel.MLEResults.predict() got multiple values for keyword argument 'start'
https://www.statsmodels.org/devel/examples/notebooks/generated/statespace_forecasting.html
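Issues:
Dates is a column of strings, so convert it to a DatetimeIndex and give it a frequency; statsmodels needs this to map future dates to forecast steps: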
df = pd.DataFrame(data)
df['Dates'] = pd.to_datetime(df['Dates'])
df.set_index('Dates', inplace=True)
df = df.asfreq('4D')
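As a quick check of the index fix, here is a minimal sketch using only the sample data from the question. The names string_indexed and dated are made up for illustration. Be aware that asfreq inserts NaN rows for any date missing from the regular grid, so it is worth counting them on a real dataset.

```python
import pandas as pd

lstvals = [30.81, 27.16, 82.15, 31.00, 9.13, 11.77, 25.58, 7.57, 7.98, 7.98]
lstdates = ['2021-01-01', '2021-01-05', '2021-01-09', '2021-01-13', '2021-01-17',
            '2021-01-21', '2021-01-25', '2021-01-29', '2021-02-02', '2021-02-06']
df = pd.DataFrame({"Dates": lstdates, "Market Value": lstvals})

# Original approach: the index holds plain strings and carries no frequency.
string_indexed = df.set_index("Dates")

# Fixed approach: parse the dates, then declare the 4-day spacing explicitly.
df["Dates"] = pd.to_datetime(df["Dates"])
dated = df.set_index("Dates").asfreq("4D")

print(type(string_indexed.index).__name__, "->", type(dated.index).__name__)
print("frequency:", dated.index.freqstr)
# Rows that asfreq had to invent show up as NaN; count them before fitting.
print("missing rows:", int(dated["Market Value"].isna().sum()))
```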
forecast() is strictly for out-of-sample forecasts and has no start or end parameters. (Note that its steps parameter can be passed a string or datetime type.) Either use predict() or get_prediction(), which support both in-sample and out-of-sample results.
ARMAResults.predict(start='2024-01-01', end='2024-11-09')
or
# mean
ARMAResults.get_prediction(start='2024-01-01', end='2024-11-09').predicted_mean
# mean, standard error, prediction interval
ARMAResults.get_prediction(start='2024-01-01', end='2024-11-09').summary_frame()