Use focal statistics instead of block statistics: with rectangular neighborhoods this produces the same results at the centers of the blocks, but because focal stats are computed with moving (overlapping) windows, they effectively create a surface of relative slopes. Moreover, focal stats can be computed over more natural neighborhoods, such as circles, rather than the artificial axis-aligned rectangles that block stats require.
ArcGIS's focal statistics command will compute a range statistic: that's the numerator. To compensate for edge effects, it's often wise not to assume the moving windows have constant area (their areas diminish toward the edges). Instead, create a unit grid (using Not(IsNull(...)), for instance) and compute the focal sum with the same neighborhood specification. Multiplying the focal sum by the cellsize squared (giving square kilometers, if that's what you need) produces a "focal area" grid. This calculation compensates not only for edge effects but also for any missing (NoData) values within the grid itself. This is the denominator: divide the focal range by it to obtain the relative slope.
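For readers outside ArcGIS, the same numerator/denominator construction can be sketched with NumPy and SciPy; the grid values, cellsize, and neighborhood radius below are made-up stand-ins, not anything from the original question:

```python
import numpy as np
from scipy.ndimage import generic_filter

# Hypothetical inputs: a 20x20 value grid with one NoData cell, 1 km cells,
# and a circular neighborhood of radius 2 cells.
cellsize_km = 1.0
r = 2
yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
footprint = (xx**2 + yy**2) <= r**2          # a disk, not a rectangle

rng = np.random.default_rng(0)
grid = rng.random((20, 20)) * 100.0
grid[5, 5] = np.nan                          # a NoData cell

def focal_range(window):
    """Range of the valid (non-NaN) values in the moving window."""
    v = window[~np.isnan(window)]
    return v.max() - v.min() if v.size else np.nan

# Numerator: focal range.  Padding with NaN lets edge windows shrink.
frange = generic_filter(grid, focal_range, footprint=footprint,
                        mode="constant", cval=np.nan)

# Denominator: the "focal area" from a unit grid (1 = data, 0 = NoData).
unit = np.where(np.isnan(grid), 0.0, 1.0)
fsum = generic_filter(unit, np.sum, footprint=footprint,
                      mode="constant", cval=0.0)
farea = fsum * cellsize_km**2                # km^2 actually present per window

relative_slope = frange / farea
```

Note how `farea` is automatically smaller in windows that overhang the grid edge or contain the NoData cell, which is exactly the compensation described above.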
If, in the end, you really need non-overlapping neighborhoods (although I can see no essential reason why that's necessary), you can subsample the resulting grid at any array of points you like, such as the centers of the originally intended blocks. Perhaps surprisingly, this may be no less efficient than computing block statistics directly: the time to carry out (simple) grid operations tends to be dominated by overhead such as data I/O rather than by the calculations themselves. At the least, having a full grid of results provides a richer and more informative map of the calculation than the block-statistics grid does.
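Subsampling at the centers of the originally intended blocks is then just array slicing; a minimal sketch with a made-up grid and block size:

```python
import numpy as np

# Stand-in for the full grid of focal-statistics results.
full_grid = np.arange(100.0).reshape(10, 10)

block = 5                          # originally intended block size, in cells
offset = block // 2                # index of each block's center cell
block_centers = full_grid[offset::block, offset::block]
```

Each entry of `block_centers` is the focal-statistics value at the center cell of one of the non-overlapping blocks.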
One does have to question the why here. What do you hope to achieve by evaluating multivariate autocorrelation? What hypothesis are you actually testing? If this is in the context of a multivariate linear model, there is no real insight to be gained from evaluating the autocorrelation structure of the entire design matrix. You want to draw inference at the parameter scale, which may include autocorrelation; but for partialling out the effects of autocorrelation, whether to meet model assumptions or to derive a term for a mixed-effects model, this is entirely unnecessary.
You cannot easily expand autocorrelation statistics into a multivariate space. Holgersson (2004) provides some insights on multivariate autocorrelation. I believe that there has been some use of Monte Carlo approaches on residuals from seemingly unrelated regression (SUR) models to evaluate multivariate autocorrelation. Smouse & Peakall (1999) proposed a permutation method to evaluate multivariate autocorrelation on multilocus genetic data, but I have never found the test to be tractable on non-genetic data. If you would like to take a model-based approach to understanding spatial structure and drawing inference from the spatial process in a multivariate space, then I would recommend exploring Principal Coordinates of Neighbour Matrices (Borcard et al., 2004).
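As a rough illustration of the PCNM idea (a simplified sketch, not Borcard et al.'s exact implementation): distances beyond a truncation threshold are replaced by four times the threshold, the resulting matrix is Gower double-centered, and the eigenvectors with positive eigenvalues serve as spatial predictors. A minimal NumPy version on made-up one-dimensional sample locations:

```python
import numpy as np

# Hypothetical sample locations along a transect.
coords = np.arange(10, dtype=float)[:, None]
n = coords.shape[0]

D = np.abs(coords - coords.T)          # pairwise distances
t = 1.0                                # truncation threshold (nearest-neighbor spacing)
Dt = np.where(D > t, 4.0 * t, D)       # far pairs are set to 4*t

# Gower double-centering of -0.5 * Dt^2, then eigendecomposition.
A = -0.5 * Dt**2
J = np.eye(n) - np.ones((n, n)) / n
G = J @ A @ J
evals, evecs = np.linalg.eigh(G)

# Keep the eigenvectors with positive eigenvalues as spatial predictors.
order = np.argsort(evals)[::-1]
pcnm_axes = evecs[:, order[evals[order] > 1e-9]]
```

The columns of `pcnm_axes` can then be used as explanatory variables describing spatial structure at multiple scales.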
There are some cross-correlation statistics (e.g., Anselin 1995; Chen 2015) that can be solved with some elegant matrix algebra, allowing for bivariate evaluation of spatial autocorrelation, but no multivariate extensions. You can perform a series of pairwise comparisons using cross-correlations, or just stick to evaluating univariate autocorrelation one variable at a time. The rub with cross-correlation approaches is that they can be difficult to interpret and can be influenced by latent processes.
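To give the flavor of the bivariate case, here is a minimal matrix-algebra sketch of one common bivariate (cross-)Moran-type statistic; this is a simplified form, not the exact statistic of either cited paper, and the lattice, weights, and variables are made up:

```python
import numpy as np

rng = np.random.default_rng(1)
side = 5
n = side * side

# Queen-contiguity weights on a toy 5x5 lattice, row-standardized.
W = np.zeros((n, n))
for i in range(side):
    for j in range(side):
        a = i * side + j
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                if di == 0 and dj == 0:
                    continue
                ii, jj = i + di, j + dj
                if 0 <= ii < side and 0 <= jj < side:
                    W[a, ii * side + jj] = 1.0
W /= W.sum(axis=1, keepdims=True)

# Two made-up variables with some shared structure.
x = rng.standard_normal(n)
y = 0.5 * x + rng.standard_normal(n)
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()

# Cross-Moran form: how x at a location relates to the weighted
# average of y at its neighbors.
I_xy = (zx @ W @ zy) / n
```

The elegance is that the whole statistic reduces to one quadratic form in the weights matrix; the interpretive difficulty mentioned above is that `I_xy` confounds the spatial structure of x, of y, and of their cross-relationship.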
Multivariate spaces are tricky because the parameters may scale differently and the state space can easily explode, making the problem intractable. This is why models that can handle complex high-dimensional spaces are so popular these days. It is not an autocorrelation approach per se, but you could evaluate the structural characteristics resulting from a clustering approach such as k-means, fit with an optimal k. You could even derive spatial lag(s) for your variables and use them in lieu of the original parameters, though this would be computationally expensive and likely would not buy you much more than just using the unaltered parameters.
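A spatial lag is just the weights matrix applied to each variable; a minimal sketch with a made-up four-unit contiguity matrix and two made-up variables:

```python
import numpy as np

# Hypothetical binary contiguity among four spatial units, row-standardized
# so each row of weights sums to one.
W = np.array([[0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0]])
W /= W.sum(axis=1, keepdims=True)

# Two made-up variables, one column each.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0]])

# Spatial lag: each unit's value becomes the mean of its neighbors' values.
X_lag = W @ X
```

Each column of `X_lag` is the spatially lagged version of the corresponding variable, which could be substituted for (or added alongside) the original columns.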
References
Anselin, L. (1995) Local indicators of spatial association. Geographical Analysis, 27:93-115.
Borcard, D., Legendre, P., Avois-Jacquet, C. and Tuomisto, H. (2004) Dissecting the spatial structures of ecological data at all scales. Ecology, 85(7):1826-1832.
Chen, Y. (2015) A new methodology of spatial cross-correlation analysis. PLoS ONE, 10(5):e0126158. doi:10.1371/journal.pone.0126158.
Holgersson, H.E.T. (2004) Testing for multivariate autocorrelation. Journal of Applied Statistics, 31(4):379-395. doi:10.1080/02664760410001681693.
Smouse, P.E. and Peakall, R. (1999) Spatial autocorrelation analysis of individual multiallele and multilocus genetic structure. Heredity, 82:561-573.
Best Answer
The source of confusion may be the two W terms in your equation, which suggest different types of weights in the formula. It will therefore be helpful to re-express the Moran's I equation in the following form:

I = (n / Σi Σj wij) × (Σi Σj wij (yi − ybar)(yj − ybar)) / Σi (yi − ybar)²
Using the dataset you reference in your post as an example, yi is the income for a polygon of interest, i, and yj is the income for each other polygon, j, in the dataset. ybar is the mean income value for all polygons in the dataset. The weight wij is 0 if polygon j is not a neighbor of polygon i. If polygon j is a neighbor of polygon i, then wij takes on a non-zero value; this value can vary depending on how a neighbor's weight is defined. n is the total number of polygons. To understand how the weights are used in the formula, we'll work with the example you share. The map of the 16 polygons along with the income values is shown in the following figure.
The first step in the workflow is to conceptualize the idea of a neighbor. Neighbors can be defined by contiguous polygons, distance between centroids, the k nearest neighbors, and so on.
The next step is to compute the neighborhood matrix. Here, all polygons are represented in both the rows and the columns: the ith polygons are listed along the rows, the jth polygons along the columns. The entry at their intersection is the weight that the polygon in the jth column contributes as a neighbor of the polygon in the ith row. The weight values can be computed in many different ways. For example, the values can be binary (1 for neighboring polygons and 0 otherwise), or fractions in which each row's neighbor weights sum to one; the latter yields the mean value of the neighboring polygons. The figure below adopts the latter (row-standardized) computation.
Once the weight matrix is computed, its values are used to compute Moran's I. The weights appear in both the numerator and the denominator, as shown in the following figure.
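Since the original figures cannot be reproduced here, the whole workflow can be sketched in NumPy: build a row-standardized rook-contiguity matrix for a toy 4x4 lattice (a stand-in for the 16 polygons, with made-up "income" values, not the dataset from the question) and plug it into the formula:

```python
import numpy as np

side = 4
n = side * side
income = np.arange(n, dtype=float)     # made-up income values, one per polygon

# Binary rook contiguity on the 4x4 lattice, then row-standardize so each
# row of weights sums to one (the "fractions" option described above).
W = np.zeros((n, n))
for i in range(side):
    for j in range(side):
        a = i * side + j
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ii, jj = i + di, j + dj
            if 0 <= ii < side and 0 <= jj < side:
                W[a, ii * side + jj] = 1.0
W /= W.sum(axis=1, keepdims=True)

# Moran's I: weighted cross-products of deviations in the numerator,
# sum of squared deviations in the denominator.
z = income - income.mean()
S0 = W.sum()                           # sum of all weights (= n here)
I = (n / S0) * (z @ W @ z) / (z @ z)
```

With row-standardized weights, S0 equals n and the leading factor drops out. Because the made-up values increase smoothly across the lattice, neighbors resemble each other and I comes out positive, indicating positive spatial autocorrelation.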