GeoPandas Max Attribute – How to Get Polygon with Maximum Attribute Within Same Geometry Using GeoPandas

geodataframe geometry geopandas maximum python

I have a GeoDataFrame containing the result of a spatial join between a square grid map and flood hazard data. Because of the spatial join intersection, some rows share the same "geometry" but have different "flood_score" values. How do I keep only the row with the maximum "flood_score" for each unique "geometry"?

I've tried the code below:

test = mrkna_grid.dissolve(by='flood_score', aggfunc='max')

However, this returns only 4 rows (as opposed to thousands), because it groups the rows by "flood_score" rather than by geometry.

Essentially, I want to do this, but it doesn't work with "geometry":

df.loc[df.reset_index().groupby(['geometry'])['flood_score'].idxmax()]

Best Answer

Let's assume there is a polygon layer (a shapefile) with the following attribute table:

[image: input attribute table]

The following code loads this shapefile into a GeoDataFrame:

import geopandas as gpd

file = "P:/Test/qgis_test/test_for_geopandas.shp"

gdf = gpd.read_file(file)
print(gdf)

The resulting GeoDataFrame:

    fid  ...                                           geometry
0   6.0  ...  POLYGON ((233499.352 5752838.208, 559980.331 5...
1   7.0  ...  POLYGON ((233499.352 5752838.208, 559980.331 5...
2   8.0  ...  POLYGON ((233499.352 5752838.208, 559980.331 5...
3   9.0  ...  POLYGON ((978501.160 5695530.377, 1164317.462 ...
4  10.0  ...  POLYGON ((978501.160 5695530.377, 1164317.462 ...
5  11.0  ...  POLYGON ((978501.160 5695530.377, 1164317.462 ...
6  12.0  ...  POLYGON ((978501.160 5695530.377, 1164317.462 ...
7  13.0  ...  POLYGON ((978501.160 5695530.377, 1164317.462 ...
8  14.0  ...  POLYGON ((978501.160 5695530.377, 1164317.462 ...
9  15.0  ...  POLYGON ((485306.490 4940108.963, 681542.397 5...

[10 rows x 4 columns]

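With the following code, it is possible to keep only the row with the maximum "flood_score" for each unique geometry. The column is named 'flood_scor' here because the shapefile format limits field names to 10 characters. Sorting in descending order puts the highest score first, so drop_duplicates keeps that row for each geometry and discards the rest.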

gdf_ = gdf.sort_values('flood_scor', ascending=False).drop_duplicates(['geometry'])
print(gdf_)

The output GeoDataFrame will look like:

    fid  ...                                           geometry
8  14.0  ...  POLYGON ((978501.160 5695530.377, 1164317.462 ...
1   7.0  ...  POLYGON ((233499.352 5752838.208, 559980.331 5...
9  15.0  ...  POLYGON ((485306.490 4940108.963, 681542.397 5...

[3 rows x 4 columns]
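
For readers without the shapefile, here is a minimal, self-contained sketch of the same idea on made-up data. The names toy, square_a, square_b and wkb_key are hypothetical and used only for illustration. It also shows one possible way to get the idxmax approach from the question to work, assuming it is acceptable to group on the WKB bytes of each geometry (which are hashable) instead of on the geometry objects themselves.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Hypothetical toy data: two distinct squares, each repeated with different scores.
square_a = box(0, 0, 1, 1)
square_b = box(1, 0, 2, 1)
toy = gpd.GeoDataFrame(
    {"flood_score": [1, 3, 2, 0, 2]},
    geometry=[square_a, square_a, square_a, square_b, square_b],
)

# Approach from the answer: sort descending, then keep the first row per geometry.
by_sort = toy.sort_values("flood_score", ascending=False).drop_duplicates(["geometry"])

# Alternative sketch: build a hashable key from each geometry's WKB bytes,
# group on that key, and select the row label of the maximum score per group.
wkb_key = pd.Series([geom.wkb for geom in toy.geometry], index=toy.index)
by_idxmax = toy.loc[toy.groupby(wkb_key, sort=False)["flood_score"].idxmax()]

print(by_sort.sort_index())
print(by_idxmax.sort_index())
```

Both variants compare geometries exactly, so two polygons that cover the same area but list their vertices in a different order would be treated as different geometries. If several rows of one geometry tie for the maximum, idxmax returns the first of them, while the sort-based variant may keep any of the tied rows.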