I'm new to working with geo data and trying to address a similar question to the one posted as How to calculate the size of a particular area below a buffer in QGIS
Using command line and/or shapely/fiona, in the most efficient and elegant way.
I created two shapefiles with their respective numerical attributes:
- small areas with data on population
- police districts with crime stats(single number)
They are independent.
I'd like to recalculate those such that both estimates are present in each of the shapefiles, eg. estimate of crime within small area (take data from 2 and look for all areas in 1 that are contained in it, cut those that intersect and do maths).
I have the maths figured out and stuck looking for elegant ways and efficient data formats to do intersections and cut out the overlaps to appropriate boundaries.
Update following the response of @gene
The adapted code does not finding any intersections in the data, which is structured as follows (and surprising since it's from official sources):
- polSHP_upd: police precincts across SA
- sal_subSHP: subset of small regions with population data in Cape Town
The files live here:
https://github.com/AnnaMag/GIS-analyses
What am I missing?
polSHP_upd = 'crime/GIS-analyses/polPres_updated4crimes/polPres_updated4crimes.shp'
sal_subSHP = 'crime/GIS-analyses/sal_sub/sal_sub.shp'
import geopandas as gpd
g1 = gpd.GeoDataFrame.from_file(polSHP_upd)
g2 = gpd.GeoDataFrame.from_file(sal_subSHP)
data = []
# A: region of the police data with criminal record
# C: small area with population data
# we look for all small areas intersecting a given C_i, calculate the fraction of inclusion, scale the
# population accordingly: area(A_j, where A_j crosses C_i)/area(A_j)* popul(A_j)
for index, crim in g1.iterrows():
for index2, popu in g2.iterrows():
if popu['geometry'].crosses(crim['geometry']): # objects overlaps to partial extent, not contained
area_int = popu['geometry'].intersection(crim['geometry']).area
area_crim = crim['geometry'].area
area_popu = popu['geometry'].area # there is a Shape_Area field in properties: to check precision
popu_count = popu['properties']['PPL_CNT']
popu_frac = (area_int / area_popu) * popu_count# fraction of the pop area contained inside the crim
data.append({'id': (index1, index2) ,'area_crim': area_crim,'area_pop': area_popu, 'area_inter': area_int, 'popu_frac': popu_frac} )
# data.append({'geometry': crim['geometry'].intersection(popu['geometry']), 'crime_stat':crim['crime_stat'], 'Population': popu['Population'], 'area':crim['geometry'].intersection(popu['geometry']).area})
elif popu['geometry'].intersects(crim['geometry']):
pass
#print("intersect")
elif popu['geometry'].disjoint(crim['geometry']):
pass`
Best Answer
The question is about Shapely and Fiona in pure Python without QGIS ("using command line and/or shapely/fiona").
A solution is
The original two layers and the resulting layer
Part of the resulting layer table
You can use a spatial index (rtree here, look at GSE: Fastest way to join many points to many polygons in python and Using Rtree spatial indexing with OGR)
Another solution is to use GeoPandas (= Pandas + Fiona + Shapely)
Update
You need to understand the definition of the spatial predicates. I use here the JTS Topology suite
As you can see there are only intersections and no crosses nor disjoint here. Some definitions from the Shapely manual
You can control it by a simple script (there are other solution but this one is the simplest)
and the result is 0
Therefore, you only need intersects here.
Your script becomes
Result: