I am very confused about how to predict/forecast using ARIMA.
Let's assume we have a series called y_orig that we split into y_train and y_test. Assuming that y_orig is not stationary, we could fit ARIMA using the code below:
# fit ARIMA model
from statsmodels.tsa.arima_model import ARIMA
model = ARIMA(y_train, order=(2,1,2))
model_fit = model.fit(disp=0)
print(model_fit.summary())
After fitting the model, we can predict using the code below
n_periods = len(y_test)
# forecast() here returns (forecast, standard error, confidence interval)
fc, _, _ = model_fit.forecast(n_periods, alpha=0.05)  # 95% conf
The value fc should give a forecast, which I then compare to y_test. Please note that, as expected, y_test is not used in the training phase. Also note that I am not looking for a rolling forecast but for a long-term forecast where the parameters (once trained) are fixed.
I am very confused because y_test is not used at all in the forecasting phase.
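To make the situation concrete, here is a minimal sketch of a multi-step forecast with fixed parameters. It assumes a hand-rolled AR(1) model as a simplified stand-in for ARIMA; the function name `multi_step_forecast`, the data values, and the coefficient `phi = 0.5` are all hypothetical and are not part of statsmodels. Each forecast is fed back in as the next input, so the held-out values only appear when the errors are computed.

```python
# Simplified stand-in for ARIMA: an AR(1) model y[t] = phi * y[t-1] + noise.
# phi is assumed to have been estimated from y_train already (hypothetical value).

def multi_step_forecast(last_value, phi, n_periods):
    """Forecast n_periods ahead, feeding each forecast back in as the next input."""
    forecasts = []
    prev = last_value
    for _ in range(n_periods):
        prev = phi * prev  # no observed test value is used here
        forecasts.append(prev)
    return forecasts

y_train = [8.0, 4.0, 2.0]  # hypothetical training data
y_test = [1.5, 0.5]        # held out, used only for scoring
phi = 0.5                  # hypothetical fitted coefficient

fc = multi_step_forecast(y_train[-1], phi, len(y_test))
errors = [actual - predicted for actual, predicted in zip(y_test, fc)]
print(fc)      # [1.0, 0.5]
print(errors)  # [0.5, 0.0]
```

In this sketch the forecast depends only on the last training value and the fixed coefficient, which is why y_test never enters the forecasting step.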
For instance, if we were to use other prediction models (like in Keras or TensorFlow), we would code it this way.
First, we fit the model in the training phase, which I don't show since it does not matter for my question. Then we predict and see how good our in-sample fit is using the code below:
y_pred_train=model.predict(y_train)
Then we test the model out of sample as below:
y_pred_test=model.predict(y_test)
In this situation, the parameters are not re-estimated and y_test is used in the testing phase to forecast the next value (with fixed parameters).
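One way to read this pattern in time-series terms is as one-step-ahead prediction: the parameters stay fixed, but each prediction uses the previous observed value rather than the previous forecast. The sketch below illustrates that idea with the same kind of hand-rolled AR(1) stand-in; `one_step_predictions`, the data, and `phi` are hypothetical, and this is not a statsmodels or Keras call.

```python
# One-step-ahead prediction with a fixed coefficient (AR(1) stand-in for ARIMA).

def one_step_predictions(history_last, observed, phi):
    """Predict each point from the previous *observed* value; phi is never re-estimated."""
    preds = []
    prev = history_last
    for actual in observed:
        preds.append(phi * prev)
        prev = actual  # feed in the true value, not the forecast
    return preds

y_train = [8.0, 4.0, 2.0]  # hypothetical training data
y_test = [1.5, 0.5]        # observed test values, used as lagged inputs
phi = 0.5                  # hypothetical fitted coefficient

preds = one_step_predictions(y_train[-1], y_test, phi)
print(preds)  # [1.0, 0.75]
```

Here the second prediction is 0.75 because it starts from the observed 1.5, whereas a multi-step forecast would start from its own first forecast of 1.0.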
Hence my confusion with ARIMA. Why do we not do the same with the ARIMA model?
Please help me understand, as I am very confused.
Thanks so much!!
I think you're a bit confused by the .fit and the y_train in the ARIMA code block. y_train is just a poorly named variable here; it should just be y, the data to be forecast. The ARIMA model has no training/test phase; it's not self-learning. It does a statistical analysis of the input data and produces a forecast. If you want to do another forecast (on y_test), you need to do another statistical analysis (using model.fit) and do another forecast (using model.forecast). The ARIMA model does not have any weights that it trains in a training phase; nothing related to any previous data it was 'fitted' on is saved in the model. You can't use a "fitted" ARIMA model to forecast other data samples.
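The workflow described above can be sketched as follows, assuming a hand-rolled AR(1) fitted by least squares in place of ARIMA. The helpers `fit_ar1` and `forecast` and both series are hypothetical illustrations, not statsmodels functions. Each series gets its own analysis step before its own forecast.

```python
# Analyse-then-forecast, repeated per series (AR(1) stand-in for ARIMA).

def fit_ar1(series):
    """Estimate phi in y[t] = phi * y[t-1] by least squares (no intercept)."""
    num = sum(cur * prev for prev, cur in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den

def forecast(series, phi, steps):
    """Continue the series for `steps` periods from its last value."""
    out, prev = [], series[-1]
    for _ in range(steps):
        prev = phi * prev
        out.append(prev)
    return out

series_a = [8.0, 4.0, 2.0, 1.0]    # hypothetical data
series_b = [16.0, 4.0, 1.0, 0.25]  # a different hypothetical series

phi_a = fit_ar1(series_a)          # analysis of series_a
fc_a = forecast(series_a, phi_a, 2)

phi_b = fit_ar1(series_b)          # separate analysis of series_b
fc_b = forecast(series_b, phi_b, 2)

print(phi_a, fc_a)  # 0.5 [0.5, 0.25]
print(phi_b, fc_b)  # 0.25 [0.0625, 0.015625]
```

The two series yield different coefficients, so in this sketch a forecast for series_b is only meaningful after the analysis has been run on series_b itself.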