Choice of weight function in Moran's I

autocorrelation, clustering, scale-invariance, spatial

I'm doing an autocorrelation analysis for a spatially distributed collection of observations. To perform my analysis, I am using Moran's I statistic.

My questions are: (1) What are the implications and benefits of using different weighting functions, e.g. $d^{-1}$, $d^{-2}$, $\exp(-d)$, and (2) is there any (perhaps informal) answer as to which of the possible weighting functions is used most frequently in the geostatistics literature (and for what purposes)?

As for why I care: I am trying to explore whether there is clustering in my data set at different scales of structure, following some of the methodology of Fauchald 2000. I am plotting Moran's I versus aggregation scale. The interesting thing is that the resulting correlation curves show very different qualitative behavior when calculated with the $d^{-1}$ and $d^{-2}$ weighting functions (the $d^{-1}$ curve has a discontinuity point, for example). I'm having a hard time understanding why this would be true. Does anyone have experience with this who may be able to point me to some references?

Best Answer

Moran's I statistic is used to explore a specific type of spatial clustering: whether high values are located in proximity to other high values and whether low values are located in proximity to other low values.

The trick, then, is first to get a sense of what you mean by proximity and second to formulate it mathematically. This idea of proximity will depend on what type of observations (attributes) you are working with and what type of questions you have in mind.
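For reference, the usual form of Moran's I makes the role of the proximity weights explicit. The symbols here ($n$ observations $x_i$ with mean $\bar{x}$, and weights $w_{ij}$ with $w_{ii} = 0$) are notation introduced for illustration, not taken from the question:

$$
I = \frac{n}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2}
$$

Your notion of proximity enters only through $w_{ij}$: a binary neighbor rule sets $w_{ij} \in \{0, 1\}$, while a distance-decay rule sets, for instance, $w_{ij} = d_{ij}^{-1}$ or $w_{ij} = d_{ij}^{-2}$, where $d_{ij}$ is the distance between observations $i$ and $j$.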

For example, for human beings proximity could mean the distance needed to have a chat. So, if you wanted to know whether high-income people like to chat with other high-income people at your cocktail party, you could formulate proximity using binary weights, where the weight is 1 when two people are within 3 feet of each other and 0 otherwise. To see whether house prices are spatially correlated, you could define proximity as two houses being neighbors, or being on the same block, or being within sight of one another, and so on.
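As a rough sketch of the cocktail-party idea, the following computes Moran's I under a binary "within a threshold distance" weight and, for comparison, under the power-law weights $d^{-1}$ and $d^{-2}$ mentioned in the question. The function names, the toy coordinates and incomes, and the choice to give zero weight to coincident points are all assumptions made for illustration.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distance matrix for an (n, k) array of point coordinates."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def binary_weights(dist, threshold):
    """w_ij = 1 if points i and j are within `threshold` of each other, else 0."""
    w = (dist <= threshold).astype(float)
    np.fill_diagonal(w, 0.0)  # a point is not its own neighbor
    return w

def power_weights(dist, power):
    """w_ij = d_ij ** (-power) where d_ij > 0; zero weight where d_ij == 0 (assumption)."""
    w = np.zeros_like(dist, dtype=float)
    positive = dist > 0
    w[positive] = dist[positive] ** (-float(power))
    return w

def morans_i(values, w):
    """Moran's I = (n / sum(w)) * (z' W z) / (z' z), with z the mean-centered values."""
    values = np.asarray(values, dtype=float)
    z = values - values.mean()
    return (len(values) / w.sum()) * (z @ w @ z) / (z @ z)

# Toy data (hypothetical): four guests standing in a line, 3 feet apart.
coords = np.array([[0.0, 0.0], [3.0, 0.0], [6.0, 0.0], [9.0, 0.0]])
dist = pairwise_distances(coords)
clustered = np.array([1.0, 1.0, 0.0, 0.0])    # similar incomes stand together
alternating = np.array([1.0, 0.0, 1.0, 0.0])  # similar incomes stand apart

w_chat = binary_weights(dist, threshold=3.0)
print("binary, clustered:  ", morans_i(clustered, w_chat))
print("binary, alternating:", morans_i(alternating, w_chat))
print("d^-1,   clustered:  ", morans_i(clustered, power_weights(dist, 1)))
print("d^-2,   clustered:  ", morans_i(clustered, power_weights(dist, 2)))
```

With the binary weights, the clustered arrangement gives a positive I and the alternating one a negative I. The power-law weights use the same data but let every pair contribute, with closer pairs counting more, so the value of I changes with the chosen exponent.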

Basically, you need a hypothesis of proximity that is based on your prior common-sense ideas or expert knowledge of why two objects that are close to one another should be more strongly associated than two objects that are far apart.

Moran's I can then be seen as a test of your hypothesis about how proximity, as you have defined it, places similar values next to one another on the landscape.
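One common way to run such a test is a permutation test: shuffle the observed values across locations many times, recompute Moran's I each time, and see how often the shuffled statistic is at least as large as the observed one. The sketch below is only an illustration of that idea, not something the answer prescribes; the function names, the transect data, the adjacency rule, and the number of permutations are all hypothetical choices.

```python
import numpy as np

def morans_i(values, w):
    """Moran's I = (n / sum(w)) * (z' W z) / (z' z), with z the mean-centered values."""
    z = values - values.mean()
    return (len(values) / w.sum()) * (z @ w @ z) / (z @ z)

def permutation_test(values, w, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    observed = morans_i(values, w)
    # Shuffling values over locations breaks any spatial structure
    # while keeping the weights (the proximity hypothesis) fixed.
    shuffled = np.array(
        [morans_i(rng.permutation(values), w) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(shuffled >= observed)) / (n_perm + 1)
    return observed, p_value

# Hypothetical example: 20 sites along a transect; neighbors are adjacent sites.
n = 20
idx = np.arange(n)
w = (np.abs(idx[:, None] - idx[None, :]) == 1).astype(float)
values = idx.astype(float)  # a smooth gradient, so neighbors are similar

observed, p_value = permutation_test(values, w)
print("Moran's I:", observed)
print("pseudo p-value:", p_value)
```

Because the weights stay fixed during shuffling, the p-value is always relative to one particular definition of proximity. Rerunning the test with a different weight matrix tests a different hypothesis.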