Python Geopandas – How to Intersect Polygons Within a Single GeoDataFrame in Python

geopandasintersectionpythonshapely

I am trying to find a way to intersect a polygons layer with itself in python to identify each newly produced intersection polygon using geopandas, but am not sure how to actually perform this with a single geodataframe, as opposed to intersecting one geodataframe with another geodataframe.

I have this map of NYC boroughs. Each borough has a "borocode"

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import Polygon, LineString, Point

nybb_path = gpd.datasets.get_path('nybb')
boros = gpd.read_file(nybb_path)
boros.set_index('BoroCode', inplace=True)
boros.sort_index(inplace=True)
boros.plot()

We get this output geodataframe:
enter image description here

and see this:

enter image description here

I then draw buffers around them via:

boros['geometry'] = boros.geometry.buffer(8000)
boros.plot(cmap='Greens', edgecolor='black', alpha=0.5)

which produces this output dataframe, which looks the same, but the geometries are different:

enter image description here

and I then see this:

enter image description here

This leads to overlapping regions, as we can see above. There are 8 of these "overlapping" regions from what I can make out here.

What I want to do is identify the regions where these newly buffer polygons "overlap", or I suppose this would be called an intersection. I want to assign them a unique ID, and create a new column that shows which original polygons "participate" in creating each intersection zone.

And so, my goal is to create this geodataframe:

ID                   Overlaps     Number_of_Overlaps                                 geometry
----------------------------------------------------------------------------------------------
0                   Manhattan                      1     POLYGON ((XX.XX XX.XX, XX.XX XX...))
1                       Bronx                      1     POLYGON ((XX.XX XX.XX, XX.XX XX...))
2                    Brooklyn                      1     POLYGON ((XX.XX XX.XX, XX.XX XX...))
3                      Queens                      1     POLYGON ((XX.XX XX.XX, XX.XX XX...))
4               Staten Island                      1     POLYGON ((XX.XX XX.XX, XX.XX XX...))
5            Manhattan, Bronx                      2     POLYGON ((XX.XX XX.XX, XX.XX XX...))
6    Manhattan, Bronx, Queens                      3     POLYGON ((XX.XX XX.XX, XX.XX XX...))
7    Manhattan, Staten Island                      3     POLYGON ((XX.XX XX.XX, XX.XX XX...))
8           Manhattan, Queens                      2     POLYGON ((XX.XX XX.XX, XX.XX XX...))
9               Bronx, Queens                      2     POLYGON ((XX.XX XX.XX, XX.XX XX...))
10   Manhattan, Bronx, Queens                      3     POLYGON ((XX.XX XX.XX, XX.XX XX...))
11              Bronx, Queens                      2     POLYGON ((XX.XX XX.XX, XX.XX XX...))
12        Manhattan, Brooklyn                      2     POLYGON ((XX.XX XX.XX, XX.XX XX...))
...

The documentation on this in geopandas only includes examples where I am overlaying one "layer" on top of another "layer", whereas with my example I am trying to find these intersection zones within the same layer itself.

My inclination would be to go with this code using the .overlay() function within geopandas, going with:

NYC_intersections = boros.overlay(boros, how='intersection')

However, this function actually requires 2 geodataframes to be intersected, so I would need to intersect 2 different geodataframes, such as with:

res_intersection = boros1.overlay(boros2, how='intersection')

However, I only have my single geodataframe to work with, so I am not sure how to actually use this code.

My question ultimately here is, how can I take my buffered polygons layer and intersect it with itself to then create a geodataframe where each polygon, including the intersection zone polygons, is identified, listing the polygons that make up each "zone" and the number of polygons making up each zone?

Best Answer

You should be using union:

import geopandas as gpd

df = gpd.read_file(r'/home/bera/Desktop/GIStest/countries.shp')
df = df[["NAME","geometry"]]

dfbuff = gpd.GeoDataFrame(df, geometry=df.buffer(40000))

u = dfbuff.overlay(dfbuff, how="union")
u = u.loc[u["NAME_1"]!=u["NAME_2"]] #Drop the self unions

#The union will create duplicates, like polygon1-polygon2 and polygon2-polygon1
u["countries"] = u.apply(lambda x: '-'.join(sorted(x[["NAME_1", "NAME_2"]])), axis=1)
u = u.drop_duplicates(subset="countries") #Drop them by name

enter image description here