[GIS] Modifying Bivariate Moran’s I/LISA to include self

geoda, spatial statistics

I am using GeoDa's Bivariate Moran's I function to examine patterns of
collocation between two variables at the county level (call them A and
B for now).

I know that, having run the analysis, I can say things like "Counties
with a high value for A tend to have neighbors with a high value of B."

What I would rather say is that "Counties with a high value of A tend
to locate in regions with a high value of B." The difference is that
the latter would seem to include the county where A is high as well as
its neighbors, while the former includes only the neighbors.

I can see my way towards changing this: simply make every county its
own neighbor in the weights matrix. But I was wondering what kind of
havoc this would play with my values for I and the rest of the
diagnostics.

PS. If you happen to stumble on this identical question posted by me here:
https://groups.google.com/forum/?fromgroups#!topic/openspace-list/WUL1kQkenWo

please note that the answer I received was incorrect, as was carefully pointed out by a rather irate reviewer.

Best Answer

The diagonal of the weights matrix represents the self-potential. Normally, when you solve for Moran's I, you remove the diagonal of the matrix. However, I cannot find anything in the GeoDa documentation that explains the default behavior of the statistic. I have never seen options for deriving intrazonal weights, so I would imagine that the diagonal is set to 0 and is therefore excluded. You may need to contact the authors for a definitive answer.
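To make the role of the diagonal concrete, here is a minimal NumPy sketch. It assumes one common formulation of the global bivariate Moran's I (the slope of the spatial lag of standardized B on standardized A, with row-standardized weights); it is not taken from GeoDa, and the function names and the five-county toy data are hypothetical.

```python
import numpy as np

def row_standardize(w):
    """Scale each row of a square weights matrix so that it sums to 1."""
    w = np.asarray(w, dtype=float)
    row_sums = w.sum(axis=1, keepdims=True)
    # Rows with no neighbors stay all-zero instead of dividing by zero.
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

def bivariate_moran(a, b, w):
    """Slope of the spatial lag of standardized b on standardized a."""
    za = (a - a.mean()) / a.std()
    zb = (b - b.mean()) / b.std()
    return float(za @ (w @ zb) / (za @ za))

# Hypothetical toy data: five counties in a row, each touching the next.
contiguity = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

# Zero diagonal: only the neighbors contribute to the lag of b.
w_excl = row_standardize(contiguity)
# Ones on the diagonal: every county is also its own neighbor.
w_incl = row_standardize(contiguity + np.eye(len(a)))

i_excl = bivariate_moran(a, b, w_excl)
i_incl = bivariate_moran(a, b, w_incl)
print(i_excl, i_incl)
```

Under this formulation, a weights matrix that is only the diagonal reproduces the ordinary Pearson correlation between A and B. Putting each county in its own neighbor set therefore blends the in-place correlation into the statistic, which is part of why the interpretation needs care.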

I know that the ArcGIS implementation has an option for including self-potential in the univariate case (sorry, no bivariate implementation). However, it does not have a correct test statistic, and the z-values and p-values are not stable when faced with non-normal distributions.
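One way to sidestep the normality assumption is a permutation-based pseudo p-value. The sketch below is only an illustration under stated assumptions: it uses the regression-slope form of the bivariate statistic, shuffles B across locations to build the reference distribution, and all names and the toy data are hypothetical.

```python
import numpy as np

def bivariate_moran(a, b, w):
    """Slope of the spatial lag of standardized b on standardized a."""
    za = (a - a.mean()) / a.std()
    zb = (b - b.mean()) / b.std()
    return float(za @ (w @ zb) / (za @ za))

def permutation_p_value(a, b, w, n_perm=999, seed=0):
    """Two-sided pseudo p-value from random relabelings of b."""
    rng = np.random.default_rng(seed)
    observed = bivariate_moran(a, b, w)
    # Shuffling b breaks any association between a and the lag of b.
    simulated = np.array([
        bivariate_moran(a, rng.permutation(b), w) for _ in range(n_perm)
    ])
    extreme = int(np.sum(np.abs(simulated) >= abs(observed)))
    # The +1 terms count the observed arrangement as one of the permutations.
    return observed, (extreme + 1) / (n_perm + 1)

# Hypothetical toy data: five counties in a row, self included as a neighbor.
contiguity = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)
w = contiguity + np.eye(5)
w = w / w.sum(axis=1, keepdims=True)  # row-standardize
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

observed, p_value = permutation_p_value(a, b, w)
print(observed, p_value)
```

Note that when the diagonal is non-zero, shuffling B also destroys the in-place correlation between A and B, so a small p-value may reflect plain aspatial correlation rather than a spatial pattern.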

Likely the only way you will be able to implement this is to code it yourself in something like PySAL or R. The univariate intrazonal weight is calculated as d_ii = 0.5 * (A_i / π)**0.5, where A_i is the area of zone i, but you will have to figure out the bivariate adjustment. If you implement this, you will have to think long and hard about the meaning. I am not sure that it is going to provide you with what you are looking for.
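As a small illustration of the self-distance formula above, the following sketch computes it for a zone of known area. Turning that distance into a diagonal weight via inverse distance is an assumption made here for illustration only, and both function names are hypothetical.

```python
import math

def intrazonal_distance(area):
    """Half the radius of a circle with the same area as the zone."""
    if area <= 0:
        raise ValueError("area must be positive")
    return 0.5 * math.sqrt(area / math.pi)

def inverse_distance_self_weight(area, power=1.0):
    """Hypothetical helper: inverse-distance weight for the matrix diagonal."""
    return 1.0 / intrazonal_distance(area) ** power

# A zone whose area equals that of a circle with radius 2 (same units squared).
zone_area = 4.0 * math.pi
d_self = intrazonal_distance(zone_area)
w_self = inverse_distance_self_weight(zone_area)
print(d_self, w_self)
```

The area must be in the square of the units used for the distances between zones, otherwise the diagonal and off-diagonal weights are not comparable.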

You could also consider a scan statistic that is better suited for spatial time-series analysis under specific distributional assumptions. This would provide you with a more suitable framework for hypothesis testing. I would also look into CrimeStat. As I recall, there is some flexibility in defining the behavior of contiguity, and there is a bivariate Moran's I/LISA.

You may also consider an autoregressive model (Li et al., 2007). This method re-scales the measure by a function of the eigenvalues of the spatial weights matrix and provides a much more robust measure of the spatial dependence.

Li, H., C.A. Calder, and N. Cressie (2007). Beyond Moran’s I: Testing for spatial
  dependence based on the spatial autoregressive model. Geographical Analysis
  39:357–375.

Sorry for not being able to provide a more definitive answer.
