
Understanding the Durbin-Watson test in R


I seem to misunderstand how the dwtest of the lmtest package in R works.

> library(lmtest)
> r1 <- runif(1000, 0.0, 1.0)
> r2 <- runif(1000, 0.0, 1.0)
> dwtest(lm(r2 ~ r1 ))

    Durbin-Watson test

data:  lm(r2 ~ r1)
DW = 1.9806, p-value = 0.3789
alternative hypothesis: true autocorrelation is greater than 0

> 
> r1 <- seq(0, 1000, by=1)
> r2 <- seq(0, 1000, by=1)
> dwtest(lm(r2 ~ r1 ))

    Durbin-Watson test

data:  lm(r2 ~ r1)
DW = 2.2123, p-value = 0.8352

If I understand everything correctly, I first test two sets of random numbers against each other (they do not correlate, which is correct).

Then I test the integers from 0 to 1000 against themselves (and the test says they do not correlate either - uhm... what?)

Can someone point out the obvious mistake I am making?


Solution

  • According to Wikipedia, the Durbin-Watson test checks for autocorrelation in the residuals of a regression, not for correlation between the variables. So, if I define r2 <- r1 + sin(r1), I get a significant result from the DW test:

    > r1 <- seq(0, 1000, by=1)
    > r2 <- r1 + sin(r1)
    > dwtest(lm(r2 ~ r1))
        Durbin-Watson test
    data:  lm(r2 ~ r1)
    DW = 0.91956, p-value < 2.2e-16
    alternative hypothesis: true autocorrelation is greater than 0
    

    Here's the reason. The value of r2[i] predicted by the linear model is approximately r1[i], because the fitted line is very close to r2 = r1. The residual, which is the difference between the actual and predicted values, is therefore approximately r2[i] - r1[i] = sin(r1[i]). If this is above zero, then r2[i+1] - r1[i+1] is probably also above zero, since they are neighboring values of the sine function. The residuals are therefore autocorrelated, meaning that neighboring residuals are correlated with each other.
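
To make the reasoning above concrete, here is a minimal base-R sketch that computes the Durbin-Watson statistic directly from the residuals, using the textbook definition (sum of squared successive differences divided by the sum of squared residuals). The helper name `dw_stat`, the variable names, and the seed are my own choices for illustration and are not part of lmtest. It only reproduces the statistic, not the p-value that dwtest reports.

```r
# Durbin-Watson statistic from a vector of residuals (hypothetical helper):
# sum of squared successive differences divided by the sum of squares.
dw_stat <- function(e) sum(diff(e)^2) / sum(e^2)

# Sine example: the residuals are roughly sin(r1), sampled one radian apart.
r1 <- seq(0, 1000, by = 1)
r2 <- r1 + sin(r1)
e_sine <- residuals(lm(r2 ~ r1))
dw_sine <- dw_stat(e_sine)

# Lag-1 correlation of the residuals; DW is approximately 2 * (1 - rho).
rho <- cor(e_sine[-1], e_sine[-length(e_sine)])

# Independent uniform noise: no autocorrelation, so DW should be near 2.
set.seed(1)  # arbitrary seed, only so the run is reproducible
u1 <- runif(1000)
u2 <- runif(1000)
dw_noise <- dw_stat(residuals(lm(u2 ~ u1)))

# Regressing a sequence on itself: the fit is essentially perfect, so the
# residuals are only floating-point rounding noise with no real structure.
s <- seq(0, 1000, by = 1)
e_same <- residuals(lm(s ~ s))
max_abs_same <- max(abs(e_same))

print(c(dw_sine = dw_sine, approx = 2 * (1 - rho), dw_noise = dw_noise))
print(max_abs_same)
```

For the sine residuals the lag-1 correlation should come out close to cos(1), which is why the statistic lands well below 2. In the question's second example the two variables are identical, so there is nothing left in the residuals to test except rounding error, and a non-significant result there says nothing about how strongly r1 and r2 are related.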