I'm trying to replicate a Stata ARIMA analysis in statsmodels. The analysis used static predictions from an ARIMA(1,0,4) model with four exogenous regressors. I've replicated the model: Stata and statsmodels return the same coefficients, z-scores, etc.
The problem I'm having is that statsmodels will only return dynamic predictions.
My data runs from 1980 to 2024, and I'm training on 1980-2018, then forecasting past that. Here's the statsmodels code I'm using to forecast:
forecast_values = model_fit.forecast(steps=32, exog=df[[list of exog vars]], dynamic=False)
Here df contains only the out-of-sample observations. This returns dynamic forecasts that do not match Stata's output. I've also tried predict, which statsmodels doesn't really seem to expect you to use out-of-sample, but this runs:
forecast_values = model_fit.predict(start='2017-01-01', end='2024-10-01', exog=merged_testing[[list of exog vars]], dynamic=False)
This returns exactly the same thing as the forecast method. The 'dynamic' argument doesn't seem to do anything here: I get the same answers whether it is False or True. How can I get static forecasts out of statsmodels? Manually constructing these estimates seems hard, given the 4 MA terms.
EDIT: I've found some clues here and there. The top answer on this question suggests that statsmodels is using dynamic=True because it doesn't have the measured values of the outcome variable, but it doesn't say how you would supply those values to statsmodels for prediction.
Similarly, this issue on the statsmodels GitHub asks basically the same question, but I don't understand the outcome: I don't see how any kind of re-estimation helps me.
Oof, I tried like twenty different statsmodels methods and finally got what I wanted, although I honestly don't really understand from the docs why this works and .predict or .forecast don't.
I finally got what I wanted using .apply(). For my data:
fore = model_fit.apply(endog=oos_df[outcome], exog=oos_df[[exog vars]])
And then I can retrieve the values using:
print(fore.fittedvalues)
Hope this helps someone who is wondering the same thing!