[GIS] Moran’s I test on regression residuals with spatial autocorrelation

spatial statistics, spatial-analyst

Suppose we want to fit a regression model $y_i = f(x_i^t) + e_i$, where $y_i$ and $x_i$ are collected spatially, and we suspect spatial autocorrelation (SAC). I have read about some popular methods, such as spatial-lag and spatial-error models, and about the Moran's I test.

My question is: suppose we believe that the SAC in $y_i$ is entirely caused by the SAC in $x_i$, that is, we believe the $e_i$ are independent of each other. Is there then any problem with simply running an ordinary regression model? It looks to me as though this does not violate any assumptions, and I can certainly run a test on the residuals. However, several application papers proceed as follows: they first run Moran's I on $y_i$ and, if it is significant, fit a spatial-lag model. Otherwise, they run Moran's I on the residuals and, if that is significant, fit a spatial-error model. This looks weird to me.

Best Answer

Running a regression on data that is spatially autocorrelated is fine, and unavoidable in most scenarios (e.g. ecological modelling).

It is when you have SAC in your residuals that you have issues. The independence assumption is then violated, which inflates the chance of a Type I error, and the parameter estimates can become unstable or biased.

So it's important to test your residuals for SAC. You could use Moran's I computed at various lags (e.g. correlograms), variograms, and local estimates of SAC. It is probably good to try a range of methods to get a better picture of the error structure.
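As a rough illustration of the scenario in the question, here is a minimal NumPy sketch. It simulates a spatially smooth $x_i$ with independent errors $e_i$, fits ordinary least squares, and computes global Moran's I for both $y_i$ and the residuals. Everything in it is an assumption made for illustration: the simulated data, the helper names (`knn_weights`, `morans_i`, `moran_permutation_test`), the row-standardised 4-nearest-neighbour weights, and the one-sided permutation test. Permuting OLS residuals is only an approximate test, since residuals are not exactly exchangeable.

```python
import numpy as np


def knn_weights(coords, k=4):
    """Row-standardised k-nearest-neighbour weights (an assumed choice of W)."""
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)             # a point is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]  # indices of the k closest points
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], nearest] = 1.0
    return W / W.sum(axis=1, keepdims=True)


def morans_i(z, W):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z mean-centred."""
    z = z - z.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)


def moran_permutation_test(z, W, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive SAC via random permutations."""
    rng = np.random.default_rng(seed)
    observed = morans_i(z, W)
    sims = np.array([morans_i(rng.permutation(z), W) for _ in range(n_perm)])
    p_value = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p_value


# Simulated example: x varies smoothly in space, the errors are independent.
rng = np.random.default_rng(42)
n = 200
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = np.sin(2 * np.pi * coords[:, 0]) + np.cos(2 * np.pi * coords[:, 1])
e = rng.normal(0.0, 0.5, size=n)               # no spatial structure in e
y = 1.0 + 2.0 * x + e

# Ordinary least squares of y on x (with intercept), then residuals.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

W = knn_weights(coords, k=4)
i_y, p_y = moran_permutation_test(y, W)
i_res, p_res = moran_permutation_test(residuals, W)

print(f"Moran's I of y:         {i_y:.3f} (pseudo p = {p_y:.3f})")
print(f"Moran's I of residuals: {i_res:.3f} (pseudo p = {p_res:.3f})")
```

In a setup like this, $y_i$ should show strong positive SAC inherited from $x_i$, while the residuals should have a Moran's I close to zero, which is the case where plain regression is adequate. With real data the outcome depends on the weights matrix, so it is worth repeating the check with a few neighbourhood definitions or distance lags.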