[GIS] Spatial join Python command to summarize attributes

arcgis-10.0 python spatial-join

I am trying to run the Spatial Join function through the Python shell in ArcGIS 10. When you do the point-and-click method of spatial join, you are given a tool window with an option that asks "How do you want the attributes to be summarized?" However, the Python syntax doesn't show a parameter for this option.

How do I control the attribute summarization through Python? Why is there no parameter for this?

Best Answer

The optional field_mapping parameter is what you're looking for. The page you linked to links to "Mapping input fields to output fields", which will get you started. The mergeRule property of the FieldMap objects is the one that controls which type of statistic/summary to calculate.

Basically you have to create a single FieldMappings object, which is a collection of individual FieldMap objects, each with its own mergeRule.
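A minimal sketch of that approach could look like the following. The function name `spatial_join_with_merge_rule`, its parameters, and any paths or field names passed to it are hypothetical; the `FieldMappings` and `FieldMap` calls are the ones I understand the arcpy documentation to describe, so check them against your ArcGIS version. `arcpy` is imported inside the function only so the sketch can be loaded on a machine without ArcGIS.

```python
def spatial_join_with_merge_rule(target_fc, join_fc, out_fc, field_name, merge_rule="Sum"):
    """Spatial join that summarizes `field_name` with the given merge rule (sketch)."""
    import arcpy  # requires an ArcGIS installation

    # Build one FieldMappings collection from the fields of both inputs
    field_mappings = arcpy.FieldMappings()
    field_mappings.addTable(target_fc)
    field_mappings.addTable(join_fc)

    # Pick out the FieldMap of the attribute to summarize and set its merge rule
    index = field_mappings.findFieldMapIndex(field_name)
    field_map = field_mappings.getFieldMap(index)
    field_map.mergeRule = merge_rule  # e.g. "First", "Sum", "Mean", "Count"
    field_mappings.replaceFieldMap(index, field_map)

    # Pass the collection as the optional field_mapping parameter
    arcpy.SpatialJoin_analysis(target_fc, join_fc, out_fc,
                               "JOIN_ONE_TO_ONE", "KEEP_ALL", field_mappings)
    return out_fc
```

A call such as `spatial_join_with_merge_rule("polygons.shp", "points.shp", "joined.shp", "score")` (made-up file and field names) would then sum the `score` values of all points joined to each polygon.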

[GIS] More Efficient Spatial join in Python without QGIS, ArcGIS, PostGIS, etc

1: without a spatial index

from shapely.geometry import shape

# `points` and `polygons` are lists of GeoJSON-like feature dicts (e.g. read with Fiona)
# iterate through points
for i, pt in enumerate(points):
    point = shape(pt['geometry'])
    # iterate through polygons
    for j, poly in enumerate(polygons):
        if point.within(shape(poly['geometry'])):
            # add the point's score to the containing polygon's score
            poly['properties']['score'] += pt['properties']['score']

2: with an R-tree index (you can use pyrtree or rtree)

from rtree import index
from shapely.geometry import shape

# Create the R-tree index and store each polygon's bounding box in it
idx = index.Index()
for pos, poly in enumerate(polygons):
    idx.insert(pos, shape(poly['geometry']).bounds)

# iterate through points
for i, pt in enumerate(points):
    point = shape(pt['geometry'])
    # iterate only through the polygons whose bounding box contains the point
    for j in idx.intersection(point.bounds):
        if point.within(shape(polygons[j]['geometry'])):
            # add the point's score to the containing polygon's score
            polygons[j]['properties']['score'] += pt['properties']['score']

Result with the two solutions:

for poly in polygons:
    print(poly['properties'])

OrderedDict([(u'score', 2)]) # 2 points in the polygon
OrderedDict([(u'score', 1)]) # 1 point in the polygon
OrderedDict([(u'score', 1)]) # 1 point in the polygon

What is the difference?

Without the index, you must test every point against every polygon.
With a bounding-box spatial index (an R-tree), you test only the geometries whose bounding boxes intersect your current geometry. This filter can save a considerable amount of computation and time.
A spatial index is not a magic wand, however: when a very large part of the dataset has to be retrieved anyway, it gives no speed benefit.
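The filtering effect can be illustrated without any GIS library. The sketch below is a stand-in under stated assumptions: it uses a plain bounding-box comparison in a linear loop rather than a real R-tree (which would avoid even visiting most boxes), the triangles and points are made-up data, and the names `bounds`, `point_in_ring` and `join_counts` are hypothetical. It counts how many exact point-in-polygon tests are run with and without the cheap bounding-box rejection.

```python
def bounds(ring):
    """Bounding box (minx, miny, maxx, maxy) of a list of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_ring(pt, ring):
    """Exact (comparatively expensive) ray-casting point-in-polygon test."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Toggle on every edge that a horizontal ray from pt crosses
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


def join_counts(points, rings, use_filter):
    """Return (points per polygon, number of exact tests performed)."""
    boxes = [bounds(r) for r in rings]
    counts = [0] * len(rings)
    exact_tests = 0
    for x, y in points:
        for j, ring in enumerate(rings):
            if use_filter:
                minx, miny, maxx, maxy = boxes[j]
                if not (minx <= x <= maxx and miny <= y <= maxy):
                    continue  # cheap rejection, no exact test needed
            exact_tests += 1
            if point_in_ring((x, y), ring):
                counts[j] += 1
    return counts, exact_tests


# Made-up data: three triangles spread along the x axis and four points
triangles = [[(0, 0), (4, 0), (0, 4)],
             [(10, 0), (14, 0), (10, 4)],
             [(20, 0), (24, 0), (20, 4)]]
pts = [(1, 1), (2, 1), (11, 1), (21, 1)]

print(join_counts(pts, triangles, use_filter=False))  # ([2, 1, 1], 12)
print(join_counts(pts, triangles, use_filter=True))   # ([2, 1, 1], 4)
```

Both runs give the same counts, but the filtered run performs 4 exact tests instead of 12. If every point fell inside every bounding box, the filter would reject nothing, which is the "not a magic wand" case.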

Afterwards, write the updated polygons to a new shapefile:

import fiona

# reuse the schema of the source polygon layer for the output
with fiona.open('poly.shp') as source:
    schema = source.schema
with fiona.open('output.shp', 'w', 'ESRI Shapefile', schema) as output:
    for poly in polygons:
        output.write(poly)

To go further, look at "Using Rtree Spatial Indexing With OGR, Shapely, Fiona".

[GIS] Spatial Join of Polygons to Polygon Grid with Shared Boundaries in ArcGIS Desktop

@FelixIP answered this in the comments...

The spatial join is unnecessary when a summary statistics function is available. (A spatial join with 'have their center in' as the match option should also work, but it did not for me.)

This is useful when large polygons carry a statistic, say a population estimate, that you need to display on a common vector grid. Sometimes two or more polygons intersect a single grid polygon. The following steps can be used:

1. First scale your statistic: calculate population per unit area and add it to the attribute table.
2. Union the polygons with the polygon grid (not a raster).
3. Calculate the new population statistic for each unioned polygon: multiply the population per unit area by the area of each unioned polygon.
4. Run summary statistics with the Gridcell ID as the unique summary field.
5. Join the output of (4) back to the polygon grid using the Gridcell ID as the unique case field.
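The arithmetic behind these steps can be sketched in plain Python. All IDs, populations and areas below are made up for illustration, the union result is typed in by hand rather than computed, and the approach assumes population is spread evenly within each source polygon.

```python
from collections import defaultdict

# Hypothetical source polygons: id -> (population, area)
source = {"A": (1000.0, 10.0), "B": (600.0, 20.0)}

# Step 1: population per unit area
density = {pid: pop / area for pid, (pop, area) in source.items()}

# Step 2: hypothetical union result as (source polygon id, grid cell id, piece area)
pieces = [("A", 1, 4.0), ("A", 2, 6.0), ("B", 2, 5.0), ("B", 3, 15.0)]

# Steps 3 and 4: population of each piece, summed per grid cell
cell_pop = defaultdict(float)
for pid, cell_id, piece_area in pieces:
    cell_pop[cell_id] += density[pid] * piece_area

# Step 5: the totals are keyed by grid cell ID, ready to join back to the grid
for cell_id in sorted(cell_pop):
    print(cell_id, cell_pop[cell_id])
# 1 400.0
# 2 750.0
# 3 450.0
```

Grid cell 2 is covered by pieces of both polygons, so it receives 600 from A and 150 from B. The cell totals add up to 1600, the combined population of the two source polygons.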

Related Solutions

[GIS] More Efficient Spatial join in Python without QGIS, ArcGIS, PostGIS, etc

[GIS] Spatial Join of Polygons to Polygon Grid with Shared Boundaries in ArcGIS Desktop
