GeoPandas – How to Spatially Join Smaller Polygons Within a Larger Polygon/Multipolygon

geopandas, python

I'm trying to spatially join smaller polygons that make up larger polygons/multipolygons using GeoPandas and the code below.

joined_gdf = gpd.sjoin(sua_2016_gdf, sa2_2021_gdf, how="left", predicate="intersects")

sua_2016_gdf contains my larger polygon boundaries and sa2_2021_gdf contains the smaller ones. When combined, the sa2 polygons share their outer boundary with the sua polygons.

I tried all the different predicate types (intersects, contains, within, touches, crosses, overlaps). The best result came from intersects, but it returns both the smaller polygons inside the larger polygon/multipolygon and the smaller polygons that only touch its border from the outside.

Is there a way I can strip out the smaller polygons that only touch the outside of the larger polygon's border so that only the polygons within remain?

Best Answer

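You can shrink the small polygons with a small negative buffer distance so that they sit strictly inside the large polygon and no longer touch the border of a neighbouring one. The distance is in the units of the layer's CRS, so this works best in a projected CRS.

Or join on a representative point of each small polygon, which is guaranteed to lie inside that polygon.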

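import geopandas as gpd

smallpoly = gpd.read_file(r"C:\GIS\GIStest\smaller_polygon.shp")
largepoly = gpd.read_file(r"C:\GIS\GIStest\larger_polygon.shp")

# Baseline: a plain intersects join also matches small polygons
# that only touch the large polygon's border from the outside
sj1 = largepoly.sjoin(smallpoly, how="left", predicate="intersects")
# sj1.shape
# Out[19]: (45, 9)

# Option 1: buffer a copy of the small polygons with a negative distance
# (in CRS units) so they fit inside the large polygon
small_buffered = smallpoly.copy()
small_buffered.geometry = smallpoly.buffer(-1)
sj2 = largepoly.sjoin(small_buffered, how="left", predicate="intersects")
# sj2.shape
# Out[24]: (26, 9)

# Option 2: join by representative point, computed from the original
# (unbuffered) small polygons so the source layer is left untouched
small_points = smallpoly.copy()
small_points.geometry = smallpoly.representative_point()
sj3 = largepoly.sjoin(small_points, how="left", predicate="intersects")
# sj3.shape
# Out[30]: (26, 9)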
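
To see the effect without any shapefiles, here is a minimal sketch on made-up toy data: one large square and three small squares, one of which only shares an edge with the large square from the outside. The names large, small, large_id and small_id are hypothetical and not from the question. It joins in the other direction (small to large) so that each row is a small polygon, then puts the polygon geometry back after the point-based join.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy data: one large square and three small squares.
# "outside" only shares the edge x = 2 with the large square.
large = gpd.GeoDataFrame({"large_id": ["A"]}, geometry=[box(0, 0, 2, 2)])
small = gpd.GeoDataFrame(
    {"small_id": ["inside_1", "inside_2", "outside"]},
    geometry=[box(0, 0, 1, 2), box(1, 0, 2, 2), box(2, 0, 3, 2)],
)

# Plain intersects join: the touching neighbour is matched as well.
by_polygon = small.sjoin(large, how="inner", predicate="intersects")

# Replace each small polygon by a point inside it, then test "within".
points = small.set_geometry(small.representative_point())
by_point = points.sjoin(large, how="inner", predicate="within")

# Restore the original polygon geometry for the rows that matched.
inside = small.loc[by_point.index].assign(large_id=by_point["large_id"])

print(sorted(by_polygon["small_id"]))
print(sorted(inside["small_id"]))
```

Joining from the small layer keeps one row per small polygon, which makes it easy to carry the large polygon's ID across and keep the small polygons' own shapes in the result.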