Solved – Durbin-Watson test and biological (non time-series) data

autocorrelation, durbin-watson-test, hypothesis testing, r, regression

I was experimenting with the cabbages data set and linear regression in R. I ran a Durbin-Watson test on the model "Vitamin C concentration as a function of cabbage head weight" and got a significant result indicating autocorrelation:

data(cabbages, package = "MASS")
lmtest::dwtest(VitC ~ HeadWt, alternative="two.sided", data=cabbages)

Result:

Durbin-Watson test

data:  VitC ~ HeadWt
DW = 1.2929, p-value = 0.003546
alternative hypothesis: true autocorrelation is not 0

  1. How should I interpret this result of significant autocorrelation in this context?
  2. Does it mean that linear regression is not suitable for this data set? If yes, what are the alternatives?
  3. Is the Durbin-Watson test appropriate in this case, as the data are not a time series?

I read several posts on the Durbin-Watson test (e.g., 1, 2, 3). I noticed that it is usually mentioned in the context of econometrics and time series analysis, but I do not clearly understand in which situations it is appropriate to use this test and in which it is not.

Best Answer

(1) The residuals are correlated along the ordering of the observations. In this case, (part of) the reason is that the observations are ordered by Cult (a factor indicating the cultivar of the cabbages). Because the first cultivar is mostly associated with negative residuals and the second cultivar mostly with positive residuals, diagnostic tests pick up this pattern. It can look like a "trend" or like "autocorrelation" if that is all the tests look for.
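
A rough way to look at this yourself, using only base R and the MASS data (the object names fit and res are just illustrative choices):

```r
data(cabbages, package = "MASS")

# Run lengths of Cult in row order: shows how the rows are grouped by cultivar
rle(as.character(cabbages$Cult))

# Residuals from the model in the question, which ignores Cult
fit <- lm(VitC ~ HeadWt, data = cabbages)
res <- residuals(fit)

# Mean residual within each cultivar
tapply(res, cabbages$Cult, mean)

# Share of positive residuals within each cultivar
tapply(res > 0, cabbages$Cult, mean)
```

If the two cultivars have mean residuals of opposite sign, consecutive residuals in row order will tend to share a sign, which is exactly the kind of pattern the Durbin-Watson statistic reacts to.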

(2) Linear regression itself seems to work OK, but it is important to control for Cult and not only for HeadWt. Possibly Date could be relevant as well. It would also be good to check what the MASS book says about the data (my copy is in the office, hence I can't check right now).
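
As a sketch of what controlling for Cult (and possibly Date) could look like, the following compares three nested models. The helper dw_stat is a hypothetical name of my own; it computes the usual Durbin-Watson statistic, sum of squared successive differences of the residuals divided by their sum of squares, so for the first model it should agree with the DW value reported in the question.

```r
data(cabbages, package = "MASS")

# Durbin-Watson statistic computed from the residuals in row order
# (dw_stat is an illustrative helper, not a function from any package)
dw_stat <- function(model) {
  e <- residuals(model)
  sum(diff(e)^2) / sum(e^2)
}

fit_headwt <- lm(VitC ~ HeadWt, data = cabbages)                # model from the question
fit_cult   <- lm(VitC ~ HeadWt + Cult, data = cabbages)         # adds cultivar
fit_full   <- lm(VitC ~ HeadWt + Cult + Date, data = cabbages)  # adds planting date

# DW statistic for each model, all in the original row order
dw <- sapply(list(HeadWt = fit_headwt, Cult = fit_cult, Full = fit_full), dw_stat)
dw

# F tests for whether Cult, and then Date, improve the fit
anova(fit_headwt, fit_cult, fit_full)
```

Whether Date is worth keeping is something to judge from the anova table and from subject-matter knowledge, not from the Durbin-Watson statistic alone.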

(3) No. The Durbin-Watson test is appropriate if the observations have correlations over "time" or some other kind of natural ordering. Even then, other autocorrelation tests might be more suitable.
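
To illustrate why the ordering matters, here is a small sketch that shuffles the rows and refits the same model. The seed and the helper name dw_stat are arbitrary choices of mine; the point is only that the regression fit is unaffected by row order while the Durbin-Watson statistic is not.

```r
data(cabbages, package = "MASS")

# Durbin-Watson statistic computed from the residuals in row order
# (dw_stat is an illustrative helper, not a function from any package)
dw_stat <- function(model) {
  e <- residuals(model)
  sum(diff(e)^2) / sum(e^2)
}

set.seed(1)  # arbitrary seed, only so the shuffle is reproducible
shuffled <- cabbages[sample(nrow(cabbages)), ]

fit_orig <- lm(VitC ~ HeadWt, data = cabbages)
fit_shuf <- lm(VitC ~ HeadWt, data = shuffled)

# Least squares ignores row order, so the coefficients agree
same_coef <- isTRUE(all.equal(coef(fit_orig), coef(fit_shuf)))
same_coef

# The DW statistic depends on row order, so the two values generally differ
c(original = dw_stat(fit_orig), shuffled = dw_stat(fit_shuf))
```

Without a meaningful ordering such as time or position in the field, a "significant" Durbin-Watson result mostly tells you something about how the data file happens to be sorted.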
