Fill Holes in Polygon Shapefile with GeoPandas – Area Threshold

geopandas, python 3.7

I have a polygon shapefile in which I want to fill the holes that are smaller than a certain area threshold (area_min_hole_sqm in the sample code below). I am new to geoprocessing in Python. I have found several threads that explain how to do this using Shapely, but I'm hoping to stick to GeoPandas if possible. If Shapely is the only way to do it, then I'm struggling with how to convert the GeoPandas format (GeoDataFrame) to a Shapely polygon and back again, especially because my shapefile has 15 attributes that need to remain in the final result. The examples I find all start with a very simple "Polygon(list of coordinates)", while I have a complex shapefile.

This is what I have so far:

import geopandas as gpd

# Read + explode polygon shapefile
shp = gpd.read_file(r"c:\Users\xx\Documents\Model\input\floodmap.shp").to_crs(epsg=32752)
shp_exp = shp.explode()

# Fill holes in polygons
area_min_hole_sqm = 100000
shp_filled = ???

# Dissolve and write polygon shapefile
shp_filled_diss = shp_filled.dissolve()
shp_filled_diss.to_file(r"c:\Users\xx\Documents\Model\output\floodmap_filled.shp", index=False)

Best Answer

List rings (holes), convert them to polygons, measure their areas, union the small ones with the original geometry:

import geopandas as gpd
from shapely.geometry import Polygon
from functools import reduce

sizelim = 1000 #Fill holes smaller than 1000 square CRS units (m2 if the layer uses a projected CRS in metres)

df = gpd.read_file(r"C:\Polygons_with_holes.shp")

def fillit(row):
    """Fill holes below the area threshold (sizelim) in a row's polygon"""
    geom = row["geometry"] #Assumes single-part Polygons; MultiPolygons have no .interiors, so explode them first
    #Convert each interior ring (hole) to a polygon and keep the ones below the threshold
    to_fill = [Polygon(ring) for ring in geom.interiors if Polygon(ring).area<sizelim]
    if not to_fill: #Nothing to fill, so keep the original geometry
        return geom
    #Union the original geometry with the small holes, which closes them
    return reduce(lambda geom1, geom2: geom1.union(geom2), [geom]+to_fill)

df["geometry"] = df.apply(fillit, axis=1)
df.to_file(r"C:\temp\filled_holes.shp")

Three holes were filled in my test data.
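As a possible variant of the same idea, here is a minimal sketch that rebuilds each polygon from its exterior ring and only the holes that are at least as large as the threshold, instead of unioning the small holes back in. It also passes MultiPolygons through part by part, so it should work without exploding first. The function name fill_small_holes and the parameter max_hole_area are made up for this illustration, and the toy square is invented test data, not the asker's flood map.

```python
from shapely.geometry import MultiPolygon, Polygon


def fill_small_holes(geom, max_hole_area):
    """Return geom without the interior rings whose area is below max_hole_area."""
    if geom is None or geom.is_empty:
        return geom  # missing or empty geometries are passed through
    if geom.geom_type == "Polygon":
        # Keep only the holes that are at least as large as the threshold
        kept = [ring for ring in geom.interiors if Polygon(ring).area >= max_hole_area]
        return Polygon(geom.exterior, kept)
    if geom.geom_type == "MultiPolygon":
        # Process every part separately and reassemble the multipart geometry
        return MultiPolygon([fill_small_holes(part, max_hole_area) for part in geom.geoms])
    return geom  # other geometry types are returned unchanged


# Toy data (hypothetical): a 10 x 10 square with a 1 x 1 hole and a 4 x 4 hole
square = Polygon(
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    holes=[
        [(1, 1), (2, 1), (2, 2), (1, 2)],
        [(4, 4), (8, 4), (8, 8), (4, 8)],
    ],
)

# With a threshold of 5 area units only the 1 x 1 hole is filled
filled = fill_small_holes(square, max_hole_area=5)
```

On the GeoDataFrame from the question, something like shp_exp["geometry"] = shp_exp.geometry.apply(lambda g: fill_small_holes(g, area_min_hole_sqm)) should apply it row by row. Only the geometry column is replaced, so the other attribute columns stay as they are.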