
Why does SARIMA have seasonal limits?


The original ARMA algorithm has the following formula:

    X_t = c + φ_1·X_{t−1} + … + φ_p·X_{t−p} + ε_t + θ_1·ε_{t−1} + … + θ_q·ε_{t−q}

And here you can see that ARMA estimates p + q + 1 numbers. So there is no question about that; that's pretty clear.

But about the SARIMA algorithm I can't understand one thing. The SARIMA formula looks like ARMA with extra seasonal terms:

    φ(B) Φ(B^S) (1 − B)^d (1 − B^S)^D X_t = θ(B) Θ(B^S) ε_t

(Here B is the backshift operator; φ and θ are the non-seasonal polynomials of degrees p and q, and Φ and Θ are the seasonal polynomials of degrees P and Q.)

Where S is a number that stands for the seasonal period. S is constant.

So SARIMA must estimate p + q + P + Q + 1 numbers — just an extra P + Q numbers. Not too many, if P = 1 and Q = 2.
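As a sanity check on that counting, here is a minimal sketch; the helper `sarima_param_count` is hypothetical (not a statsmodels function), and it just restates the p + q + P + Q + 1 arithmetic:

```python
# Hypothetical helper: the number of coefficients a SARIMA(p, d, q)x(P, D, Q, S)
# model estimates depends only on p, q, P, Q (plus one innovation variance),
# never on the seasonal period S itself.
def sarima_param_count(p, q, P, Q):
    """p + q non-seasonal terms, P + Q seasonal terms, plus sigma^2."""
    return p + q + P + Q + 1

weekly = sarima_param_count(1, 2, 1, 1)   # model with S = 7
yearly = sarima_param_count(1, 2, 1, 1)   # model with S = 365

print(weekly, yearly)  # 6 6 — identical parameter counts
```

So by a pure parameter count, S should not matter at all — which is exactly the puzzle.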

But if we use too long a period, for example 365 days for a daily time series, SARIMA just can't stop fitting. Look at these two models: the first one takes 9 seconds to fit, while the second one hadn't finished fitting after 2 hours!

import statsmodels.api as sm

# Weekly seasonality (S = 7): fits in about 9 seconds
model = sm.tsa.statespace.SARIMAX(
    df.meantemp_box, 
    order=(1, 0, 2),
    seasonal_order=(1, 1, 1, 7)
).fit()

# Yearly seasonality (S = 365): still not finished after 2 hours
model = sm.tsa.statespace.SARIMAX(
    df.meantemp_box, 
    order=(1, 0, 2),
    seasonal_order=(1, 1, 1, 365)
).fit()

And I can't understand that. Mathematically these models are the same — they both take the same p, q, P and Q. But the second one either takes too long to learn, or is not able to learn at all.

Am I getting something wrong?


Solution

  • First, a possible solution: if you are using Statsmodels v0.11 or the development version, then you can use the following when you have long seasonal effects:

    mod = sm.tsa.arima.ARIMA(endog, order=(1, 0, 2), seasonal_order=(1, 1, 1, 365))
    res = mod.fit(method='innovations_mle', low_memory=True, cov_type='none')
    

    The main restriction is that your time series cannot have missing entries. If you have missing values, then you will need to impute them somehow before creating the model.
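For illustration, a gap could be filled with simple linear interpolation before modelling. This is a generic pure-Python sketch — in practice something like pandas' `Series.interpolate()` would usually do this job:

```python
# Minimal linear-interpolation imputer for interior gaps (None values) in a
# regularly spaced series. A sketch only: gaps at the edges are left as-is.
def interpolate_gaps(values):
    out = list(values)
    i = 0
    while i < len(out):
        if out[i] is None:
            start = i - 1                       # index of last known value
            j = i
            while j < len(out) and out[j] is None:
                j += 1                          # j = index of next known value
            if start >= 0 and j < len(out):     # interior gap: interpolate
                step = (out[j] - out[start]) / (j - start)
                for k in range(i, j):
                    out[k] = out[start] + step * (k - start)
            i = j
        else:
            i += 1
    return out

print(interpolate_gaps([1.0, None, None, 4.0]))  # [1.0, 2.0, 3.0, 4.0]
```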

    Also, not all of our results features will be available to you, but you can still print the summary with the parameters, compute the loglikelihood and information criteria, compute in-sample predictions, and do out-of-sample forecasting.


    Now to explain what the problem is:

    The problem is that these models are estimated by putting them in state space form and then applying the Kalman filter to compute the log-likelihood. The dimension of the state space form of an ARIMA model grows quickly with the number of periods in a complete season - for your model with s=365, the dimension of the state vector is 733.

    The Kalman filter requires multiplying matrices with this dimension, and by default, memory is allocated for matrices of this dimension for each of your time periods. That's why it takes forever to run (and it takes up a lot of memory too).
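A back-of-the-envelope comparison makes the slowdown concrete. Each filter step multiplies k × k matrices, an O(k³) operation; using 733 from above and roughly 17 as my own estimate of the s=7 model's state dimension:

```python
# Illustration, not a benchmark: per-step Kalman filter cost scales roughly
# with the cube of the state dimension, so compare the two models' costs.
weekly_dim, yearly_dim = 17, 733  # assumed state dimensions for s=7 vs s=365

ratio = (yearly_dim / weekly_dim) ** 3
print(f"~{ratio:,.0f}x more arithmetic per time step")
```

A factor on the order of tens of thousands per time step is more than enough to turn 9 seconds into hours.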

    For the solution above, instead of computing the log-likelihood using the Kalman filter, we compute it using something called the innovations algorithm. Then we only run the Kalman filter once to compute the results object (this allows for e.g. forecasting). The low_memory=True option instructs the model not to store all of the large-dimensional matrices for each time step, and the cov_type='none' option instructs the model not to try to compute standard errors for the model's parameters (which would require a lot more log-likelihood evaluations).