First, let us try to create a Minimal, Complete, and Verifiable example
geoj = {
    'type': 'FeatureCollection',
    'features': [{
        'type': 'Feature',
        'geometry': {
            'type': 'MultiPolygon',
            'coordinates': [
                [[ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0],
                   [100.0, 1.0], [100.0, 0.0] ]],
                [[ [101.0, 0.0], [102.0, 0.0], [102.0, 1.0],
                   [101.0, 1.0], [101.0, 0.0] ]],
                [[ [100.0, 1.0], [101.0, 1.0], [101.0, 2.0],
                   [100.0, 2.0], [100.0, 1.0] ]],
                [[ [101.0, 1.0], [102.0, 1.0], [102.0, 2.0],
                   [101.0, 2.0], [101.0, 1.0] ]],
            ]
        }
    }]
}
(Note that 'features' must be a list, and four disjoint unit squares form a MultiPolygon, not a single Polygon.) One needs to make geoj a pysal-supported type, as follows
>>> import numpy as np
>>> import pysal as ps
>>> import shapely.geometry as shg
>>> coords = geoj["features"][0]["geometry"]["coordinates"]
>>> polys = [ps.cg.asShape(shg.Polygon(c[0])) for c in coords]
>>> points = [p.centroid for p in polys]
>>> points_as_array = np.array(points)  # I am not sure if this is mandatory in recent versions of pysal
To model a binary neighborhood structure, you can use the function knnW_from_array
>>> wnn2 = ps.knnW_from_array(points_as_array, 2)  # deprecated, to be removed and replaced by KNN.from_array
>>> wnn2.transform = 'r'  # make wnn2 row-stochastic (each row sums to 1)
>>> wnn2.neighbors
{0: [2, 1], 1: [3, 0], 2: [3, 0], 3: [1, 2]}
>>> wnn2.weights
{0: [0.5, 0.5], 1: [0.5, 0.5], 2: [0.5, 0.5], 3: [0.5, 0.5]}
>>> wnn3 = ps.knnW_from_array(points_as_array, 3)
>>> wnn3.transform = 'r'
>>> wnn3.neighbors
{0: [2, 1, 3], 1: [0, 3, 2], 2: [0, 3, 1], 3: [2, 1, 0]}
>>> wnn3.full()[0]
array([[ 0. , 0.33333333, 0.33333333, 0.33333333],
[ 0.33333333, 0. , 0.33333333, 0.33333333],
[ 0.33333333, 0.33333333, 0. , 0.33333333],
[ 0.33333333, 0.33333333, 0.33333333, 0. ]])
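Under the hood, knnW simply queries a KD-tree for each point's k nearest neighbors. A minimal sketch with plain scipy, assuming the centroids sit at the centers of the four unit squares above (ties, such as the two equidistant neighbors of each cell, may come back in either order):

```python
import numpy as np
from scipy.spatial import cKDTree

# assumed centroids of the four unit squares from the GeoJSON above
pts = np.array([[100.5, 0.5], [101.5, 0.5], [100.5, 1.5], [101.5, 1.5]])
tree = cKDTree(pts)
# query k+1 neighbors because each point's nearest neighbor is itself
dist, idx = tree.query(pts, k=3)
neighbors = {i: list(row[1:]) for i, row in enumerate(idx)}
# each cell's two nearest neighbors are the horizontally and vertically
# adjacent cells, matching wnn2.neighbors up to tie ordering
print(neighbors)
```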
Thanks to Luc Anselin and his modeling team, let us do what we want: model, say, continuous interactions of the distance-decay type, whose parameter is based on the well-known gravity model, alpha = 2.
>>> alpha = 2
One first has to turn points_as_array into an object of type scipy.spatial.ckdtree.cKDTree
>>> kd = ps.common.KDTree(points_as_array)
and retrieve the distances as a sparse matrix
>>> dmat = kd.sparse_distance_matrix(kd, max_distance=1e50)
As you can see, I chose such a big threshold that the matrix is actually very unlikely to be sparse. Now one initializes the neighbors and weights dictionaries, mandatory to instantiate a pysal.weights.weights.W object, as follows
>>> ids = np.arange(dmat.shape[0])  # if one has no ids to provide
>>> neighbors = {i: [] for i in ids}
>>> weights = {i: [] for i in ids}
which can now be filled with neighborhoods and distance-decreasing weights
>>> for key, distance in dmat.items():
...     i, j = key
...     if j not in neighbors[i]:
...         weights[i].append(pow(distance, -alpha))
...         neighbors[i].append(j)
...     if i not in neighbors[j]:
...         weights[j].append(pow(distance, -alpha))
...         neighbors[j].append(i)
>>> wd2 = ps.weights.W(neighbors, weights)
>>> wd2.transform = 'r'
>>> wd2.full()[0]
array([[ 0. , 0.4, 0.4, 0.2],
[ 0.4, 0. , 0.2, 0.4],
[ 0.4, 0.2, 0. , 0.4],
[ 0.2, 0.4, 0.4, 0. ]])
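This matrix can be double-checked with plain numpy: with the centroids assumed at the centers of the four unit squares, the inverse-square distances row-standardize exactly to 0.4 / 0.4 / 0.2:

```python
import numpy as np

alpha = 2
# assumed centroids of the four unit squares
pts = np.array([[100.5, 0.5], [101.5, 0.5], [100.5, 1.5], [101.5, 1.5]])
# pairwise Euclidean distances
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
w = np.zeros_like(d)
off = ~np.eye(len(pts), dtype=bool)          # mask out the diagonal
w[off] = d[off] ** -alpha                    # gravity-style distance decay
w /= w.sum(axis=1, keepdims=True)            # row-standardize, like transform = 'r'
print(w)
```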
Let us play with Moran's I
>>> y = np.array([21., 79., 49., 51.])
>>> mi_wnn2 = ps.Moran(y, wnn2, two_tailed=False)
>>> mi_wnn3 = ps.Moran(y, wnn3, two_tailed=False)
>>> mi_wd2 = ps.Moran(y, wd2, two_tailed=False)
>>> mi_wnn2.I, mi_wnn2.p_norm
(-0.46555819477434679, 0.32870362604749792)
>>> mi_wnn3.I, mi_wnn3.p_norm
(-0.33333333333333337, 0.49999999579646004)
>>> mi_wd2.I, mi_wd2.p_norm
(-0.38622327790973882, 0.3287036260474977)
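The wnn3 value can be verified by hand: with every other unit a neighbor, the row-standardized weights are W = (J - I)/3, and the textbook formula I = (n/S0) * z'Wz / z'z collapses to -1/3 regardless of y. A numpy check:

```python
import numpy as np

y = np.array([21., 79., 49., 51.])
# row-standardized wnn3 weights: all three other units are neighbors
W = (np.ones((4, 4)) - np.eye(4)) / 3.0
z = y - y.mean()                 # deviations from the mean
n, S0 = len(y), W.sum()          # S0 is the sum of all weights
I = (n / S0) * (z @ W @ z) / (z @ z)
print(I)  # -0.3333..., matching mi_wnn3.I
```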
Best Answer
To answer the last question, I generally use GeoDA software to calculate the Moran's I (and by extension the local indicators of spatial autocorrelation [LISA]). This software does calculate a significance value for Moran's I.
On the question of the z-score, I am not an expert, but it seems to me that it is an indicator that makes it possible to evaluate the significance of a score such as Moran's I. I don't think the z-score is strictly needed, though, because the function also returns a p-value (well, an estimate made from a number of random permutations, hence the permutations parameter in the function's signature). However, if you really need the z-score, the documentation specifies that esda.Moran does return it, as attributes of the result object. Unfortunately the documentation gives no more details about the meaning of these different values. I suppose that
z_norm is to be used if the variable follows a normal distribution, and z_rand if it does not. As for z_sim, it is a value simulated by permutations, so I suppose that is the value to use when in doubt, but I am really not sure.
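A minimal sketch of that access, continuing the session from the first answer (the attribute names z_norm, z_rand and z_sim are taken from the esda.Moran documentation; the trailing comments paraphrase its attribute descriptions):

```
>>> mi = ps.Moran(y, wd2, permutations=999, two_tailed=False)
>>> mi.z_norm  # z-value of I under the normality assumption
>>> mi.z_rand  # z-value of I under the randomization assumption
>>> mi.z_sim   # standardized I based on the permutations
```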