[GIS] Spatial weight for PySAL from a geojson file or geodataframe

moran-indexpysalpython

I want to calculate the Moran I value of a geojson file through python. Recently I am trying to use pysal and I find that I need to get the a spatial weight to realize it. But from the official guide it seems that it only supports shapefiles. I am wondering whether I can get spatial weight from a geojson file without convert it to shapefile or if there are any others tools for calculating spatial autocorrelation with geojson file or geodataframe in python.

Best Answer

First, let us try to create a Minimal, Complete, and Verifiable example

geoj = {
  'type': 'FeatureCollection',
  'features':
  {
    'type': 'Feature',
    'geometry': {
      'type': 'Polygon',
      'coordinates': [
          [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0],
            [100.0, 1.0], [100.0, 0.0] ],
          [ [101.0, 0.0], [102.0, 0.0], [102.0, 1.0],
            [101.0, 1.0], [101.0, 0.0] ],
          [ [100.0, 1.0], [101.0, 1.0], [101.0, 2.0],
            [100.0, 2.0], [100.0, 1.0] ],
          [ [101.0, 1.0], [102.0, 1.0], [102.0, 2.0],
            [101.0, 2.0], [101.0, 1.0] ],
          ]
      }
    }
  }

One needs to make geoj be a pysal supported type, as follows

>>> import numpy as np
>>> import pysal as ps
>>> import shapely.geometry as shg

>>> coords= geoj["features"]["geometry"]["coordinates"]
>>> polys = [
       ps.cg.asShape(shg.Polygon(c))
       for c in coords
    ]
>>> points = [
        p.centroid
        for p in polys
    ]
>>> points_as_array = np.array(points) # I am not sure if this is mandatory in recent versions of pysal


To model binary correlation structure, you can use the function knnW_from_array

>>> wnn2 = ps.knnW_from_array(points_as_array, 2) # deprecated and removed soon (or later), and replaced by KNN.from_array
>>> wnn2.transform = 'r' # make wnn2 be row-stochastic
>>> wnn2.neighbors
{0: [2, 1], 1: [3, 0], 2: [3, 0], 3: [1, 2]}
>>> wnn2.weights
{0: [0.5, 0.5], 1: [0.5, 0.5], 2: [0.5, 0.5], 3: [0.5, 0.5]}

>>> wnn3 = ps.knnW_from_array(points_as_array, 3)
>>> wnn3.transform = 'r'
>>> wnn3.neighbors
{0: [2, 1, 3], 1: [0, 3, 2], 2: [0, 3, 1], 3: [2, 1, 0]}
>>> wnn3.full()[0]
array([[ 0.        ,  0.33333333,  0.33333333,  0.33333333],
       [ 0.33333333,  0.        ,  0.33333333,  0.33333333],
       [ 0.33333333,  0.33333333,  0.        ,  0.33333333],
       [ 0.33333333,  0.33333333,  0.33333333,  0.        ]])


Thanks to Sir Anselin and his modeling team, let us do what we want, modeling, say, continuous interactions of the type distance decay whose parameter is based on the well known gravity model: alpha=2.

One first has to turn points_as_array into an object of the type scipy.spatial.ckdtree.cKDTree

>>> kd = ps.common.KDTree(points_as_array)

retrieve distances as sparse matrix

>>> dmat = kd.sparse_distance_matrix(kd, max_distance=1e50)

As you can see, I choose such a big threshold that the matrix is very unlikely to be sparse actually. Now one initiates our neighbors and weights variables, mandatory to instantiate a pysal.weights.weights.W object, as follows

>>> ids       = np.arange(dmat.shape[0]) # If one has no ids to provide
>>> neighbors = dict([(i,[]) for i in ids])
>>> weights   = dict([(i,[]) for i in ids])

which can now be filled with neighborhoods and distance-decreasing weights

>>> for key,distance in dmat.items():           
    i,j = key
    if j not in neighbors[i]:
        weights[i].append(pow(distance,-alpha))
        neighbors[i].append(j)
    if i not in neighbors[j]:
        weights[j].append(pow(distance,-alpha))
        neighbors[j].append(i)
>>> wd2 = ps.weights.W(neighbors, weights)
>>> wd2.transform = 'r'
>>> wd2.full()[0]
array([[ 0. ,  0.4,  0.4,  0.2],
       [ 0.4,  0. ,  0.2,  0.4],
       [ 0.4,  0.2,  0. ,  0.4],
       [ 0.2,  0.4,  0.4,  0. ]])


Let us play with Moran's I

>>> y = np.array([21., 79., 49., 51.])
>>> mi_wnn2 = ps.Moran(y, wnn2, two_tailed=False)
>>> mi_wnn3 = ps.Moran(y, wnn3, two_tailed=False)
>>> mi_wd2  = ps.Moran(y, wd2, two_tailed=False)
>>> mi_wnn2.I, mi_wnn2.p_norm
(-0.46555819477434679, 0.32870362604749792)
>>> mi_wnn3.I, mi_wnn3.p_norm
(-0.33333333333333337, 0.49999999579646004)
>>> mi_wd2.I, mi_wd2.p_norm
(-0.38622327790973882, 0.3287036260474977)
Related Question