Regression Analysis – Understanding the Durbin Watson Test for Detecting Autocorrelation in Time Series

autocorrelation, durbin-watson-test, regression, time series

From what I have gathered, the test statistic for the Durbin-Watson test can range from 0 to 4. The lower limit of 0 makes sense, since the statistic is a ratio of two sums of squares; but what gives us the upper limit of 4? Is this limit incorrect? And if the upper limit of this test statistic is not 4, then what is it?

Best Answer

The limits are both correct.

The DW statistic is, for $e_t$ the residuals of an appropriate regression,

\begin{align*} DW&=\frac{\sum_{t=2}^T(e_t-e_{t-1})^2}{\sum_{t=1}^Te_t^2}\\ &=\frac{\sum_{t=2}^T(e_t^2+e_{t-1}^2-2e_{t}e_{t-1})}{\sum_{t=1}^Te_t^2}\\ &=\frac{\sum_{t=2}^Te_t^2}{\sum_{t=1}^Te_t^2}+\frac{\sum_{t=2}^Te_{t-1}^2}{\sum_{t=1}^Te_t^2}-2\frac{\sum_{t=2}^Te_{t}e_{t-1}}{\sum_{t=1}^Te_t^2} \end{align*}

The first two fractions are between 0 and 1: every term is nonnegative, and the denominator sums one more nonnegative term than the numerator. In fact, they will almost always be very close to 1, as the numerators differ from the denominator only by $e_1^2$ and $e_T^2$, respectively, which will be negligible for reasonably large $T$.

The third fraction can be bounded between -1 and 1 by the Cauchy-Schwarz inequality:

\begin{align*} \left(\sum_{t=2}^Te_{t}e_{t-1}\right)^2&\leq\sum_{t=2}^Te_{t}^2\sum_{t=2}^Te_{t-1}^2\\ &\leq\sum_{t=1}^Te_{t}^2\sum_{t=1}^Te_{t}^2=\left(\sum_{t=1}^Te_{t}^2\right)^2, \end{align*}

so that

$$ -\sum_{t=1}^Te_{t}^2\leq\sum_{t=2}^Te_{t}e_{t-1}\leq\sum_{t=1}^Te_{t}^2 $$

Hence $-2$ times the third fraction is at most 2, and together with the first two fractions being at most 1 each, $DW\leq 1+1+2=4$. The lower bound $DW\geq 0$ is immediate from the definition, as a ratio of sums of squares.

Somewhat less rigorously, but more intuitively: the DW statistic can approximately be written as

$$ DW\approx 2(1-\hat\rho),$$

where $\hat\rho$ is the estimated $AR(1)$ coefficient. As $\hat\rho\to 1$ the statistic tends to 0, and as $\hat\rho\to-1$ it tends to 4.
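To see where the approximation comes from, here is a short sketch. It assumes that $\hat\rho$ is the no-intercept OLS coefficient from regressing $e_t$ on $e_{t-1}$, and it writes $S=\sum_{t=1}^Te_t^2$ (a symbol introduced here only for brevity). Then

$$ \hat\rho=\frac{\sum_{t=2}^Te_te_{t-1}}{\sum_{t=2}^Te_{t-1}^2},\qquad \sum_{t=2}^Te_t^2=S-e_1^2,\qquad \sum_{t=2}^Te_{t-1}^2=S-e_T^2, $$

and substituting these into the three fractions of the DW statistic gives

\begin{align*} DW&=\left(1-\frac{e_1^2}{S}\right)+\left(1-\frac{e_T^2}{S}\right)-2\hat\rho\left(1-\frac{e_T^2}{S}\right)\\ &\approx 2(1-\hat\rho)\quad\text{when } e_1^2/S \text{ and } e_T^2/S \text{ are negligible.} \end{align*}

The first line is exact under the stated assumption; only the second line drops the two edge terms.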

That this is not rigorous follows from the fact that $|\hat\rho|$ can be bigger than one. For large $T$ that should not happen very often when the true $\rho$ is less than one in absolute value, as it is required to be by assumption. But it can happen:

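library(lmtest)  # provides dwtest()
library(car)     # provides durbinWatsonTest()

set.seed(4)
T <- 50
u <- rep(0, T)
rho <- -0.999    # true AR(1) coefficient, just inside (-1, 1)

# simulate an AR(1) process
for (i in 2:T) {
  u[i] <- rho * u[i-1] + rnorm(1)
}

regr <- lm(u ~ 1)  # regression on a constant only
dwtest(regr)
durbinWatsonTest(regr)

# by hand:
uhat <- regr$residuals

# no-intercept OLS estimate of the AR(1) coefficient
rhohat <- lm(uhat[2:T] ~ uhat[1:(T-1)] - 1)$coef[1]
(DWstat <- sum(diff(uhat)^2) / sum(uhat^2))
# [1] 3.799259
(ApproxDWStat <- 2 * (1 - rhohat))
# uhat[1:(T - 1)]
#        4.044021
# the approximation exceeds 4 because rhohat is below -1 here,
# while the exact statistic stays within [0, 4]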
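As a complement, the two bounds can be illustrated with hand-made residual vectors instead of simulated ones. This is only a minimal sketch in base R: `dw_stat` is a hypothetical helper defined here (it is not part of `lmtest` or `car`), and the two residual patterns are illustrative assumptions, not data from the answer.

```r
# Hypothetical helper: DW statistic computed directly from a residual vector.
dw_stat <- function(e) sum(diff(e)^2) / sum(e^2)

n <- 50

# Perfectly alternating residuals e_t = (-1)^t: every successive difference
# is +/-2, so DW = 4 * (n - 1) / n, which is 3.92 for n = 50.
e_alt <- (-1)^(1:n)
dw_alt <- dw_stat(e_alt)

# No-intercept lag-1 OLS coefficient for the alternating case; it equals -1,
# so the approximation 2 * (1 - rho) gives exactly 4.
rho_alt <- sum(e_alt[-1] * e_alt[-n]) / sum(e_alt[-n]^2)
approx_alt <- 2 * (1 - rho_alt)

# Slowly varying residuals (a centered linear trend): differences are tiny
# relative to the levels, so DW is close to 0.
e_smooth <- (1:n) - mean(1:n)
dw_smooth <- dw_stat(e_smooth)

print(c(alternating = dw_alt, approximation = approx_alt, smooth = dw_smooth))
```

The alternating case shows that the exact statistic approaches 4 from below as `n` grows but never reaches it, whereas the approximation hits 4 exactly; the smooth case sits near the lower bound.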
