Correlated residuals in time series

I use the "vars" R package for multivariate time series analysis. When I fit a bivariate VAR, serial.test() always gives a very low p-value, so we reject H0 and conclude that the residuals are serially correlated. The right thing to do is to increase the order of the VAR, but even with a very high order (p=20 or more) my residuals are still correlated. How is this possible?

I can't really give you reproducible code because I don't know how to reproduce a VAR whose residuals are always correlated. For me it's a really unusual situation, but if someone knows how this is possible, it would be great.

Solution

This is probably a better question for Cross Validated, as it doesn't contain any R code or a reproducible example, but you're probably going to need to do more digging than "I have a low p-value". Have you tested your data for normality? Also, to say

The right thing to do is to increase the order of the VAR

is very inaccurate. What type of data are you working with that you would set a lag order as high as 20? A typical value for yearly data is 1, for quarterly data 4, and for monthly data 12. You can't just keep throwing higher and higher orders at the problem and expect that to fix issues in the underlying data.

Assuming you have an optimal lag value, your data is normally distributed, and you still have a low p-value, there are several ways to go.

Minor cases of positive serial correlation (say, lag-1 residual autocorrelation in the range 0.2 to 0.4, or a Durbin-Watson statistic between 1.2 and 1.6) indicate that there is some room for fine-tuning in the model. Consider adding lags of the dependent variable and/or lags of some of the independent variables. Or, if you have an ARIMA+regressor procedure available in your statistical software, try adding an AR(1) or MA(1) term to the regression model. An AR(1) term adds a lag of the dependent variable to the forecasting equation, whereas an MA(1) term adds a lag of the forecast error. If there is significant correlation at lag 2, then a 2nd-order lag may be appropriate.
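To compare your own residuals against the thresholds quoted above, both diagnostics are easy to compute by hand. The following is a minimal sketch in plain Python, not the output of serial.test(); the residual values and the function names are made up for illustration.

```python
def lag_autocorrelation(residuals, lag=1):
    """Sample autocorrelation of a residual series at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    numer = sum(centered[t] * centered[t - lag] for t in range(lag, n))
    return numer / denom


def durbin_watson(residuals):
    """Durbin-Watson statistic: squared first differences over squared residuals."""
    numer = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denom = sum(r * r for r in residuals)
    return numer / denom


# Hypothetical residuals from one equation of a fitted model (illustration only).
residuals = [0.5, 0.8, 0.3, -0.2, -0.6, -0.4, 0.1, 0.6, 0.4, -0.3, -0.7, -0.5]

r1 = lag_autocorrelation(residuals, lag=1)
dw = durbin_watson(residuals)
print(f"lag-1 autocorrelation: {r1:.2f}, Durbin-Watson: {dw:.2f}")
```

The two numbers carry nearly the same information: the Durbin-Watson statistic is approximately 2 * (1 - r1), so a value near 2 means little lag-1 correlation, and values toward 0 mean strong positive correlation.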

If there is significant negative correlation in the residuals (lag-1 autocorrelation more negative than -0.3 or DW stat greater than 2.6), watch out for the possibility that you may have overdifferenced some of your variables. Differencing tends to drive autocorrelations in the negative direction, and too much differencing may lead to artificial patterns of negative correlation that lagged variables cannot correct for.
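The effect of overdifferencing can be seen on simulated data. This sketch assumes a series that is already white noise (so it needs no differencing at all) and differences it anyway; the seed and sample size are arbitrary choices for illustration.

```python
import random


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    numer = sum(centered[t] * centered[t - 1] for t in range(1, n))
    return numer / denom


random.seed(0)  # arbitrary seed so the run is repeatable

# White noise: already stationary and serially uncorrelated.
white_noise = [random.gauss(0.0, 1.0) for _ in range(2000)]

# Differencing it anyway is a textbook case of overdifferencing.
differenced = [white_noise[t] - white_noise[t - 1] for t in range(1, len(white_noise))]

r1_original = lag1_autocorrelation(white_noise)
r1_differenced = lag1_autocorrelation(differenced)
print(f"original: {r1_original:.2f}, after differencing: {r1_differenced:.2f}")
```

In theory the first difference of white noise has a lag-1 autocorrelation of exactly -0.5, which is well past the -0.3 warning level mentioned above, and no amount of extra lags removes it cleanly.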

If there is significant correlation at the seasonal period (e.g. at lag 4 for quarterly data or lag 12 for monthly data), this indicates that seasonality has not been properly accounted for in the model. Seasonality can be handled in a regression model in one of the following ways: (i) seasonally adjust the variables (if they are not already seasonally adjusted), or (ii) use seasonal lags and/or seasonally differenced variables (caution: be careful not to overdifference!), or (iii) add seasonal dummy variables to the model (i.e., indicator variables for different seasons of the year, such as MONTH=1 or QUARTER=2, etc.). The dummy-variable approach enables additive seasonal adjustment to be performed as part of the regression model: a different additive constant can be estimated for each season of the year. If the dependent variable has been logged, the seasonal adjustment is multiplicative. (Something else to watch out for: it is possible that although your dependent variable is already seasonally adjusted, some of your independent variables may not be, causing their seasonal patterns to leak into the forecasts.)
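Option (iii) can be illustrated without any regression library: when a full set of seasonal dummies is the only regressor, the least-squares coefficient on each dummy is simply the mean of the series in that season. The sketch below assumes simulated quarterly data with made-up seasonal effects, and checks the lag-4 autocorrelation before and after the adjustment.

```python
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    numer = sum(centered[t] * centered[t - lag] for t in range(lag, n))
    return numer / denom


random.seed(1)  # arbitrary seed so the run is repeatable

# Hypothetical additive quarterly effects plus noise (illustration only).
seasonal_effect = [5.0, -2.0, 1.0, -4.0]
n_years = 50
y = [seasonal_effect[t % 4] + random.gauss(0.0, 1.0) for t in range(4 * n_years)]

# With only quarterly dummies as regressors, each OLS dummy coefficient
# equals the mean of y over the observations in that quarter.
dummy_coef = [sum(y[q::4]) / len(y[q::4]) for q in range(4)]

# Residuals after removing the estimated seasonal constants.
adjusted = [y[t] - dummy_coef[t % 4] for t in range(len(y))]

r4_before = autocorrelation(y, 4)
r4_after = autocorrelation(adjusted, 4)
print(f"lag-4 autocorrelation before: {r4_before:.2f}, after: {r4_after:.2f}")
```

In a real model the dummies would sit alongside the lagged variables rather than replace them, but the mechanism is the same: one additive constant per season.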

Major cases of serial correlation (a Durbin-Watson statistic well below 1.0, autocorrelations well above 0.5) usually indicate a fundamental structural problem in the model. You may wish to reconsider the transformations (if any) that have been applied to the dependent and independent variables. It may help to stationarize all variables through appropriate combinations of differencing, logging, and/or deflating.
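As a rough illustration of stationarizing by logging and differencing, the sketch below assumes a simulated series whose logarithm is a random walk with drift (a hypothetical 1% average growth per period). The levels are strongly autocorrelated, while the log differences behave like noise around the drift.

```python
import math
import random


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    numer = sum(centered[t] * centered[t - 1] for t in range(1, n))
    return numer / denom


random.seed(2)  # arbitrary seed so the run is repeatable

drift = 0.01  # hypothetical average growth rate per period
log_level = math.log(100.0)
levels = []
for _ in range(200):
    # Log of the series follows a random walk with drift.
    log_level += drift + random.gauss(0.0, 0.02)
    levels.append(math.exp(log_level))

# Logging then differencing gives the per-period growth rate.
log_diff = [math.log(levels[t]) - math.log(levels[t - 1]) for t in range(1, len(levels))]

r1_levels = lag1_autocorrelation(levels)
r1_growth = lag1_autocorrelation(log_diff)
mean_growth = sum(log_diff) / len(log_diff)
print(f"levels: {r1_levels:.2f}, log differences: {r1_growth:.2f}")
```

Whether a given variable needs logging, differencing, deflating, or none of these depends on the data, so this is only a pattern to check against, not a recipe.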