python time-series forecasting statsmodels

How to change maxlag for ARMAX.predict?

I am still in the process of understanding the ARIMA source code in order to forecast some data. I use two time series, indexed_df and external_df, with 365 data points each.

I want to compare the forecast accuracy between ARMA and ARMAX.

The forecasting process for ARMA seems to work fine, but forecasting with one additional external variable fails:

Fitting the ARMAX model and printing its estimated parameters:

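import statsmodels.api as sm

# Fit an ARMA(2, 0) model with external_df as the exogenous regressor (ARMAX)
arma_mod1 = sm.tsa.ARMA(indexed_df, (2, 0), exog=external_df).fit()
y = arma_mod1.params
print('P- and Q-Values(ARMAX):')
print(y)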

Out:

P- and Q-Values(ARMAX):
const      34.739272
0           0.000136
ar.L1.0     0.578090
ar.L2.0     0.129253
dtype: float64

Getting a predicted value (in-sample):

start_pred = '2013-12-30'
end_pred = '2013-12-30'
period = (start_pred, end_pred)

predict_price1 = arma_mod1.predict(start_pred, end_pred, exog=True, dynamic=True)
print('Predicted Price (ARMAX): {}'.format(predict_price1))

Out:

Traceback (most recent call last):

  File "<ipython-input-102-78b3d705d411>", line 6, in <module>
    predict_price1 = arma_mod1.predict(start_pred, end_pred, exog=True, dynamic=True)

  File "/Applications/anaconda/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.5-x86_64.egg/statsmodels/base/wrapper.py", line 92, in wrapper
    return data.wrap_output(func(results, *args, **kwargs), how)

  File "/Applications/anaconda/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.5-x86_64.egg/statsmodels/tsa/arima_model.py", line 1441, in predict
    return self.model.predict(self.params, start, end, exog, dynamic)

  File "/Applications/anaconda/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.5-x86_64.egg/statsmodels/tsa/arima_model.py", line 736, in predict
    start, method)

  File "/Applications/anaconda/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.5-x86_64.egg/statsmodels/tsa/arima_model.py", line 327, in _arma_predict_out_of_sample
    exog)

  File "/Applications/anaconda/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.5-x86_64.egg/statsmodels/tsa/arima_model.py", line 293, in _get_predict_out_of_sample
    X = lagmat(np.dot(exog, exparams), p, original='in', trim='both')

  File "/Applications/anaconda/lib/python2.7/site-packages/statsmodels-0.6.1-py2.7-macosx-10.5-x86_64.egg/statsmodels/tsa/tsatools.py", line 328, in lagmat
    raise ValueError("maxlag should be < nobs")

ValueError: maxlag should be < nobs

My understanding of maxlag is that (if not defined before) the number of lags to be observed will be automatically calculated with:

maxlag = int(round(12*(nobs/100.)**(1/4.)))

but I don't understand where I could change this calculation or set maxlag myself.

My understanding of nobs is that it is the number of time steps, i.e. the number of values in my time series (365 in my case).

So that means I need maxlag < 365, right?
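For reference, the formula above can be evaluated directly for a series of this length. This is only a small sketch; it assumes the ADF-style rule of thumb quoted above is the relevant default, which may not hold for ARMA.predict.

```python
# Rule-of-thumb default quoted above, evaluated for a 365-point series.
nobs = 365
maxlag = int(round(12 * (nobs / 100.) ** (1 / 4.)))

print(maxlag)          # 17
print(maxlag < nobs)   # True
```

The result is far below 365, so this default on its own would not violate maxlag < nobs.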

Where can I define maxlag?

The same error occurred in the question "ADF test in statsmodels in Python", but I have no clue where to set maxlag for ARMAX prediction.

Help appreciated

Solution

The code:

predict_price1 = arma_mod1.predict(start_pred, end_pred, exog=True, dynamic=True)
print('Predicted Price (ARMAX): {}'.format(predict_price1))

has to be changed into:

predict_price1 = arma_mod1.predict(start_pred, end_pred, exog=external_df, dynamic=True)
print('Predicted Price (ARMAX): {}'.format(predict_price1))

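That way it works! The exog argument of predict expects the exogenous data itself, not a boolean flag. Judging by the traceback, lagmat is called on np.dot(exog, exparams) with maxlag = p (the AR order, 2 here), so with exog=True there is only a single value and the check maxlag < nobs fails. The maxlag in the error message is therefore not something that has to be set separately.

I compared the result with the prediction made without external_df and the values were different, which I take as evidence that the external variable is actually being used.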
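The difference between the two calls can be sketched with NumPy alone, without fitting a model. This is a rough illustration, assuming that the nobs checked inside lagmat is simply the length of np.dot(exog, exparams), as the traceback line suggests. The helper rows_seen_by_lagmat is made up for this example, and the 365-point series of ones is a placeholder for external_df.

```python
import numpy as np

p = 2                             # AR order of the fitted ARMA(2, 0) model
exparams = np.array([0.000136])   # the single exog coefficient printed above

def rows_seen_by_lagmat(exog):
    """Hypothetical helper: length of np.dot(exog, exparams), as in the traceback."""
    exog = np.asarray(exog, dtype=float)
    if exog.ndim == 1:
        exog = exog[:, None]      # treat a 1-d series as a single column
    return np.atleast_1d(np.dot(exog, exparams)).shape[0]

nobs_flag = rows_seen_by_lagmat(True)             # what exog=True amounts to
nobs_series = rows_seen_by_lagmat(np.ones(365))   # placeholder for external_df

print(nobs_flag, p < nobs_flag)       # one row only, so maxlag < nobs fails
print(nobs_series, p < nobs_series)   # 365 rows, so the check passes
```

With a boolean there is only one "observation" to lag, which is fewer than the two AR lags, whereas the full exogenous series gives 365.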