[GIS] Shapely polygon union results in strange artifacts of tiny, non-overlapping areas

geopandaspolygonpythonshapelyunion

I am not sure if this is a shapely bug or a problem with my polygons.

I have a bunch of polygons that look like this:
enter image description here
They are level-12 watersheds provided by HydroBasins, and are presumably non-overlapping.

I do some analysis to select a group of these polygons based on their basin ID, then join that group together to make a single, large polygon:
enter image description here

I have done this in four other areas without problem. However, for this polygon, I find that it actually has tiny holes, but in unusual locations:
enter image description here

Here's a zoom-view:
enter image description here

In this view, the yellow lines represent the border of the large polygon:
enter image description here

I should note that when displaying in QGIS, no matter how far I zoom in, the smaller polygons do not have gaps between them.

What I find unusual is that these tiny gaps seem to occur at roughly constant lines of latitude. The region is Arctic (i.e. high-latitudes), so I am wondering if they appear due to the WGS84 distortions at higher latitudes.

I am using geopandas to manage and select the smaller polygons, and shapely to join them together:

polys = []
for i in idcs:
    polys.append(gdf.iloc[i].geometry)

polyout = shapely.ops.unary_union(polys)

where "gdf" is a pandas geodataframe and "idcs" are the indices of smaller polygons I want to join together.

How can I tell if this is a shapely bug or an issue with my smaller polygons, and either way, what is the best approach to remove these? I could buffer each of the individual polygons by a very small amount before union-ing, but I would like to avoid that if possible.

Best Answer

This is typically a result of the the borders not fitting perfectly one next to another (and this is very easy to get with floating point coordinates).

As an example, I use the world dataset available in geopandas, and take the unary union of the Africa continent:

import geopandas
gdf = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
gdf_africa = gdf[gdf['continent'] == 'Africa']
africa = gdf_africa.geometry.unary_union
# only take main continent, disregard Madagascar
africa = list(africa)[1]

You can see on the following plot that also this union is not perfect:

enter image description here

This is a typical problem to encounter when doing unary unions, so now how to solve this:

  1. One possible solution is to take the exterior of this polygon (so disregarding the interior holes) (taken from this answer), and create a new polygon of it:

    from shapely.geometry import Polygon
    africa2 = Polygon(africa.exterior)
    

    which gives this result:

    enter image description here

    which is already much better. There is still a small distortion at the left, because such a 'hole' touched the overall border.

  2. Another solution is to apply a very small buffer to the polygon, and undo it again (taken from this answer)

    from shapely.geometry import JOIN_STYLE
    eps = 0.00001
    africa3 = africa.buffer(eps, 1, join_style=JOIN_STYLE.mitre).buffer(-eps, 1, join_style=JOIN_STYLE.mitre)
    

    which gives:

    enter image description here

    In this case, this solution looks better from the figure (no remaining distortion on the left), but this method can slightly change the coordinates (although this is often negligible).

Which of the methods works best for you, probably depends on the actual case.