Python – How to Remove Small Holes from a Polygon Using GDAL and Shapely

gdalgeometrypolygonpythonshapely

I am writing a post-processing flow on Python. I am trying to come up with an algorithm to remove the small holes from the polygon.

I am using osgeo OGR right now, but I'm ok to do this part with shapely instead.

Currently, my solution is using Buffer like: geometry.Buffer(0.2 * 1 / 111111)

This works most of the time, but sometimes GDAL eats the whole memory on the machine and just crashes. I guess this is not optional at all. How can I accomplish my desired result?

The long gap on the right must persist, only the holes that are smaller the certain threshold have to be removed.

Polython with small holes

Best Answer

You can use the following example script. The script removes all holes from a polygon whose area is smaller than eps.

from shapely import wkt
from shapely.geometry import Polygon

# sample polygon
polygon_wkt = 'Polygon ((670651.58235156117007136 4087356.72129086637869477, 670941.20806862262543291 4087351.57990169199183583, 670944.42234629089944065 4087244.82082350691780448, 670648.40824284462723881 4087249.48037469061091542, 670651.58235156117007136 4087356.72129086637869477),(670669.30897117499262094 4087340.80642584012821317, 670671.69532915309537202 4087328.12050161883234978, 670709.06604568683542311 4087324.62686039274558425, 670709.12268763419706374 4087339.48419455718249083, 670669.30897117499262094 4087340.80642584012821317),(670670.63015638350043446 4087310.41313102655112743, 670670.24676115158945322 4087294.13430501008406281, 670711.11178680183365941 4087293.54062445322051644, 670711.46667919436004013 4087311.23376588197425008, 670670.63015638350043446 4087310.41313102655112743),(670669.74969205493107438 4087283.51269924966618419, 670668.85500870971009135 4087257.31934182625263929, 670755.23019460134673864 4087254.81062116287648678, 670754.28513388405553997 4087284.15052131982520223, 670669.74969205493107438 4087283.51269924966618419),(670796.18594406009651721 4087338.05096173053607345, 670797.80696410103701055 4087257.43525168858468533, 670807.4089543444570154 4087256.92088755592703819, 670806.12872222356963903 4087338.25089854281395674, 670796.18594406009651721 4087338.05096173053607345),(670821.05711140809580684 4087337.8436735225841403, 670824.12722124985884875 4087255.84220110531896353, 670929.67811466054990888 4087253.72090833634138107, 670930.47750282462220639 4087337.56890620524063706, 670821.05711140809580684 4087337.8436735225841403))'
polygon = wkt.loads(polygon_wkt)

list_interiors = []
eps = 1000

for interior in polygon.interiors:
    p = Polygon(interior)    
    if p.area > eps:
        list_interiors.append(interior)

new_polygon = Polygon(polygon.exterior.coords, holes=list_interiors)

Before:
enter image description here

After:
enter image description here


For a multipolygon, you should iterate over the list of its polygons as well.

from shapely import wkt
from shapely.geometry import Polygon, MultiPolygon

multipolygon_wkt = 'MultiPolygon (((670651.58235156117007136 4087356.72129086637869477, 670941.20806862262543291 4087351.57990169199183583, 670944.42234629089944065 4087244.82082350691780448, 670648.40824284462723881 4087249.48037469061091542, 670651.58235156117007136 4087356.72129086637869477),(670669.30897117499262094 4087340.80642584012821317, 670671.69532915309537202 4087328.12050161883234978, 670709.06604568683542311 4087324.62686039274558425, 670709.12268763419706374 4087339.48419455718249083, 670669.30897117499262094 4087340.80642584012821317),(670670.63015638350043446 4087310.41313102655112743, 670670.24676115158945322 4087294.13430501008406281, 670711.11178680183365941 4087293.54062445322051644, 670711.46667919436004013 4087311.23376588197425008, 670670.63015638350043446 4087310.41313102655112743),(670669.74969205493107438 4087283.51269924966618419, 670668.85500870971009135 4087257.31934182625263929, 670755.23019460134673864 4087254.81062116287648678, 670754.28513388405553997 4087284.15052131982520223, 670669.74969205493107438 4087283.51269924966618419),(670796.18594406009651721 4087338.05096173053607345, 670797.80696410103701055 4087257.43525168858468533, 670807.4089543444570154 4087256.92088755592703819, 670806.12872222356963903 4087338.25089854281395674, 670796.18594406009651721 4087338.05096173053607345),(670821.05711140809580684 4087337.8436735225841403, 670824.12722124985884875 4087255.84220110531896353, 670929.67811466054990888 4087253.72090833634138107, 670930.47750282462220639 4087337.56890620524063706, 670821.05711140809580684 4087337.8436735225841403)),((670648.88963292469270527 4087236.86040469771251082, 670943.25558728328906 4087232.40550666907802224, 670942.91290281957481056 4087129.60016754502430558, 670646.83352614217437804 4087133.71238110959529877, 670648.88963292469270527 4087236.86040469771251082),(670667.39459396700840443 4087217.67007472785189748, 670718.45457906532101333 4087218.35544365551322699, 670716.3984722828026861 4087206.36148742400109768, 670667.73727843072265387 4087207.04685635166242719, 670667.39459396700840443 4087217.67007472785189748),(670666.02385611203499138 4087192.65410887403413653, 670736.27417118020821363 4087192.65410887403413653, 670734.90343332523480058 4087158.38566249934956431, 670665.3384871844900772 4087159.07103142701089382, 670666.02385611203499138 4087192.65410887403413653),(670752.72302544000558555 4087176.54793907795101404, 670773.96946219238452613 4087175.17720122309401631, 670770.19993309106212109 4087158.72834696341305971, 670750.32423419377300888 4087157.70029357215389609, 670752.72302544000558555 4087176.54793907795101404),(670794.18784555338788778 4087175.86257015075534582, 670922.00915053102653474 4087176.20525461435317993, 670919.26767482107970864 4087139.19533252995461226, 670794.18784555338788778 4087140.90875484840944409, 670794.18784555338788778 4087175.86257015075534582),(670806.52448624826502055 4087215.61396794533357024, 670825.02944729058071971 4087214.92859901767224073, 670821.94528711691964418 4087184.08699728036299348, 670803.09764161077328026 4087183.40162835316732526, 670806.52448624826502055 4087215.61396794533357024),(670838.73682584054768085 4087214.24323009047657251, 670923.72257284971419722 4087213.55786116281524301, 670922.69451945857144892 4087182.37357496190816164, 670838.05145691300276667 4087186.48578852694481611, 670838.73682584054768085 4087214.24323009047657251)))'
multipolygon = wkt.loads(multipolygon_wkt)

list_parts = []
eps = 1000

for polygon in multipolygon.geoms:
    list_interiors = []
    
    for interior in polygon.interiors:
        p = Polygon(interior)

        if p.area > eps:
            list_interiors.append(interior)

    temp_pol = Polygon(polygon.exterior.coords, holes=list_interiors)
    list_parts.append(temp_pol)
    
new_multipolygon = MultiPolygon(list_parts)

Before:

enter image description here

After:

enter image description here