GeoPandas – Casting Geometry to MULTI

geopandas

Shapefiles can mix simple POLYGONs and MULTIPOLYGONs in the same data source. Spatial databases like PostGIS and SpatiaLite are strict, and will not put a POLYGON in a MULTIPOLYGON geometry column.

I've gotten used to using ST_Multi to fix this issue. But now I am trying to use GeoPandas to do some file processing, including converting from shapefile to GeoPackage (with a bunch of stuff in the middle), and I am running into this error:

gdf.to_file("garbage.gpkg", "GPKG")
ValueError: Record's geometry type does not match collection schema's 
   geometry type: 'MultiPolygon' != 'Polygon'

Is there a GeoPandas equivalent to ST_Multi that I can use to fix the geometry before saving to the GeoPackage or SpatiaLite format?

Best Answer

TL;DR

Given a geopandas data frame gdf with mixed Polygons and MultiPolygons, the geometry column can be converted to Multi with the following:

from shapely.geometry.polygon import Polygon
from shapely.geometry.multipolygon import MultiPolygon

gdf["geometry"] = [MultiPolygon([feature]) if isinstance(feature, Polygon) \
    else feature for feature in gdf["geometry"]]

More info:

There doesn't appear to be an equivalent of PostGIS's ST_Multi that can accept a Polygon or a MultiPolygon, and casts to Multi while not harming geometries that are already Multi.

The problem with casting to Multi without checking the input type

The MultiPolygon contructor requires a list of polygons. If the feature is a Polygon, the feature has to be wrapped in list brackets: MultiPolygon([feature]). Otherwise, MultiPolygon(feature) throws the following error:

TypeError: 'Polygon' object is not iterable

If the feature is already a MultiPolygon, MultiPolygon(feature) is harmless, but MultiPolygon([feature]) will extract only one polygon part from the multipart feature, and drop all others.

The solution

Hence, the type must be determined first, and MultiPolygon only applied to non-Multi features (e.g., simple Polygons). The list comprehension above:

  1. Extracts each feature with for feature in gdf["geometry"].
  2. Checks if it is a Polygon with if isinstance(feature, Polygon).
  3. Passes Polygons only to the MultiPolygon constructor with MultiPolygon([feature]).
  4. Returns the feature untouched with else feature for features which are already MultiPolygons.
  5. Assigns back to the geometry column with gdf["geometry"] =
Related Question