Dissolving/merging overlapping polygons in shapefile but retaining a list of original attribute values in Python/GeoPandas

dissolve geopandas overlay python

I am looking for an answer to this question, for overlapping polygons, but in Python and preferably with GeoPandas.

Background

I have files with up to 90'000 polygons (see Tab. 1 in the Minimal Working Example (MWE) attached). These polygons sometimes overlap to a large extent (see Fig. 1 in the MWE attached; e.g. the polygons "One", "Two", and "Three" all overlap with each other).

Aim

I would like to get Tab. 2 in the MWE attached, i.e. merge/dissolve the overlapping polygons within one GeoDataFrame while keeping the IDs of all polygons that were merged/dissolved to create each new shape. Shapes that do not overlap with anything should be preserved (see "No Overlap" in Tab. 2 in the MWE attached).

What I have tried

  1. `geopandas.overlay(gdf1, gdf1, how='union')` -> does not seem to give back merged shapes, but rather intersections, whether I set `how` to 'union' or 'intersection' (see here for details)
  2. Using this idea with Shapely. It seems tedious to me and I would prefer a GeoPandas approach if possible. I also did not find a way to preserve the IDs
  3. Using this idea I retrieve the right shapes, but lose my polygon IDs, or rather keep only the first one

I am sure there must be an easy pythonic way, but I think I lack the terminology to find the solution.

Minimal Working Example (MWE)

Best Answer

You can use groupby and specify an aggregate function for each field you want to keep.

I have four fields,

LAN_KOD (my group by field)

KOMMUNKOD which I want as a list/comma separated string

SomeFloat which should be summed

geometry which should be dissolved:

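import geopandas as gpd

df = gpd.read_file(r'C:\GIS\data\testdata\ak_riks_2.shp')

# One aggregate function per field to keep.
# With just list as the function for KOMMUNKOD, the result would not export
# as a shapefile, so the values are joined into a comma-separated string.
aggfunctions = {'KOMMUNKOD': lambda x: ','.join(str(y) for y in list(x)),
                'SomeFloat': 'sum',
                # On GeoPandas 1.0+, unary_union is deprecated in favor of union_all()
                'geometry': lambda x: x.unary_union}

df2 = df.groupby('LAN_KOD').agg(aggfunctions)
# df2 became a plain pandas DataFrame by the grouping, so rebuild the
# GeoDataFrame and carry over the CRS of the input
df3 = gpd.GeoDataFrame(df2, geometry='geometry', crs=df.crs)
df3.to_file(r'C:\GIS\data\testdata\ak_riks_3.shp')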
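
The same grouping can probably also be written with GeoPandas' own `dissolve`, which forwards `aggfunc` to the pandas aggregation and keeps the result a GeoDataFrame. The following is only a sketch: the field names come from the answer above, but the three boxes and all attribute values are made-up toy data.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy data standing in for the shapefile used above
gdf = gpd.GeoDataFrame(
    {
        "LAN_KOD": ["01", "01", "02"],
        "KOMMUNKOD": ["0114", "0115", "0201"],
        "SomeFloat": [1.5, 2.5, 4.0],
    },
    geometry=[box(0, 0, 2, 2), box(1, 0, 3, 2), box(10, 10, 11, 11)],
)

# dissolve() unions the geometries per group and applies the given
# aggregate function to each listed attribute column
dissolved = gdf.dissolve(
    by="LAN_KOD",
    aggfunc={
        "KOMMUNKOD": lambda s: ",".join(str(v) for v in s),
        "SomeFloat": "sum",
    },
)

print(dissolved)
```

The result is indexed by `LAN_KOD`; call `reset_index()` before `to_file` if the grouping field should be written as a regular column.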

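
The groupby approach needs an existing grouping field. For the overlapping polygons in the question no such field exists, so one possible way (a sketch, not part of the original answer) is to derive the groups from the geometry itself: union everything, split the result into its connected parts, and assign each original polygon to the part that contains it. The column names `poly_id` and `poly_ids` and the toy boxes are assumptions for illustration, and the `predicate` keyword of `sjoin` assumes GeoPandas 0.10 or newer.

```python
import geopandas as gpd
from shapely.geometry import box
from shapely.ops import unary_union

# Hypothetical toy data: three mutually overlapping boxes and one isolated box
gdf = gpd.GeoDataFrame(
    {"poly_id": ["One", "Two", "Three", "No Overlap"]},
    geometry=[
        box(0, 0, 2, 2),
        box(1, 1, 3, 3),
        box(1, 0, 3, 2),
        box(10, 10, 11, 11),
    ],
)

# 1. Union all polygons; overlapping ones fuse into a single part
merged = unary_union(list(gdf.geometry))
parts = list(merged.geoms) if hasattr(merged, "geoms") else [merged]
dissolved = gpd.GeoDataFrame(geometry=parts, crs=gdf.crs)

# 2. A representative point always lies inside its polygon, and therefore
#    inside exactly one merged part
points = gpd.GeoDataFrame(
    gdf[["poly_id"]],
    geometry=gdf.geometry.representative_point(),
    crs=gdf.crs,
)

# 3. Spatially join the points to the parts and collect the IDs per part
#    as a comma-separated string (lists do not export to shapefiles)
joined = gpd.sjoin(points, dissolved, how="inner", predicate="within")
dissolved["poly_ids"] = joined.groupby("index_right")["poly_id"].agg(
    lambda s: ",".join(sorted(s))
)

print(dissolved)
```

Because the assignment goes through a spatial join rather than pairwise comparisons, this should scale reasonably to inputs of the size mentioned in the question, although that has not been measured here.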