
What is the formula being used in the in-sample prediction of statsmodels?


I would like to know what formula statsmodels ARIMA uses in predict/forecast. For a simple AR(1) model I thought it would be y_t = a1 * y_{t-1}. However, I am not able to recreate the results produced by forecast or predict.

Here's what I am trying to do:

from statsmodels.tsa.arima.model import ARIMA
import numpy as np

def ar_series(n):
    # generate the series y_t = a1 * y_{t-1} + eps, with eps = 0.3 * Uniform(0, 1)
    np.random.seed(1)
    y0 = np.random.rand()
    y = [y0]
    a1 = 0.7  # the AR coefficient
    for i in range(1, n):
        y.append(a1 * y[i - 1] + 0.3 * np.random.rand())
    return np.array(y)

series = ar_series(10)
model = ARIMA(series, order=(1, 0, 0))
fit = model.fit()
#print(fit.summary())
# const = 0.3441;   ar.L1 = 0.6518
print(fit.predict())
y_pred = [0.3441]

for i in range(1, 10):
    y_pred.append(0.6518 * series[i - 1])

y_pred = np.array(y_pred)
print(y_pred)

The two series don't match, and I have no idea how the in-sample predictions are being calculated.


Solution

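  • Found the answer here: https://faculty.washington.edu/ezivot/econ584/notes/forecast.pdf. I think what I was trying to do is valid only if the process mean is zero. With a nonzero mean mu (the const reported in the summary), the one-step prediction is mu + a1 * (y_{t-1} - mu) rather than a1 * y_{t-1}, and the first in-sample prediction is mu itself.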
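
To make the mean-adjusted formula concrete, here is a minimal sketch of how the in-sample predictions could be rebuilt by hand. It assumes that the `const` statsmodels reports is the process mean (not a regression intercept) and that `fit.params` is ordered as const, ar.L1, sigma2. The helper names `ar1_in_sample_predictions` and `compare_with_statsmodels` are made up for this illustration and are not part of statsmodels.

```python
import numpy as np


def ar1_in_sample_predictions(series, const, phi):
    """One-step-ahead in-sample predictions for an AR(1) with nonzero mean.

    Assumes `const` is the process mean, so that
    y_hat[t] = const + phi * (series[t-1] - const) for t >= 1,
    and the first prediction is the unconditional mean.
    """
    series = np.asarray(series, dtype=float)
    y_pred = np.empty_like(series)
    y_pred[0] = const  # no earlier observation: fall back to the mean
    y_pred[1:] = const + phi * (series[:-1] - const)
    return y_pred


def compare_with_statsmodels(series):
    """Fit ARIMA(1, 0, 0) and return (statsmodels predictions, manual ones)."""
    # Imported here so the helper above also works without statsmodels.
    from statsmodels.tsa.arima.model import ARIMA

    fit = ARIMA(series, order=(1, 0, 0)).fit()
    # Assumed parameter order for this model: const, ar.L1, sigma2
    const, phi = fit.params[0], fit.params[1]
    return fit.predict(), ar1_in_sample_predictions(series, const, phi)
```

With the series from the question, the two arrays returned by `compare_with_statsmodels(series)` should agree up to floating-point error, whereas the original loop drops the `const * (1 - phi)` term and so runs consistently low.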