GeometryCollection causing RuntimeError in Python

geometry geopandas python runtimeerror

I am trying to run a loop over my large and small polygon files, but I get a RuntimeError:

RuntimeError: GDAL Error: Attempt to write non-polygon (LINESTRING)
geometry to POLYGON type shapefile. Failed to write record: {'id':
'11416', 'type': 'Feature', 'properties': {'FIDi': 1191, 'ID': 11416},
'geometry': {'type': 'GeometryCollection', 'geometries': [{'type':
'LineString', 'coordinates': ((55081…))

The loop looks something like this. It goes through the large polygons, finds the small polygons within each one and calculates their area. The result should be a shapefile of small polygons. However, when I try to export the result as a shapefile with appended_concat.to_file(r'E:\...\result.shp'), I get the error above. When I export as GPKG, everything works. I was fairly sure that polygon geometry is always either POLYGON or MULTIPOLYGON, so I cannot understand why I get a GEOMETRYCOLLECTION.

import geopandas as gpd
import pandas as pd

large = gpd.read_file(r'E:\...\Polygons.shp')
small = gpd.read_file(r'E:\...\Small_polygon.shp')
points = gpd.read_file(r'E:\...\points.shp')

# check if there are any invalid geometries
layers = [large, small, points]  # avoid shadowing the built-in `list`
for gdf in layers:
    invalid_gdf = gdf[~gdf.geometry.is_valid]
    print(gdf.shape)          # all rows
    print(invalid_gdf.shape)  # rows with an invalid geometry

(19726, 3)
(0, 3)
(2415, 285)
(0, 285)
(1096029, 5)
(0, 5)


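appended = []

for i in large.index:
    # small polygons that belong to large polygon i
    df = small[small['FID'] == i]
    df_result = gpd.overlay(large, df, how='intersection')
    # area in hectares, assuming the CRS units are metres
    df_result['area'] = round(df_result.area / 10000, 4)
    df_result_2 = df_result[df_result['FID'] == i]
    print(df_result_2)

    # collect the intersection result for this large polygon
    appended.append(df_result_2)

# concatenate once, after the loop, then write the output
appended_concat = pd.concat(appended)
appended_concat.to_file(r'E:\...\result.shp')  # raises the RuntimeError

appended_concat.to_file(r'E:\...\result.gpkg', driver='GPKG')  # works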

And when I import the GPKG file into QGIS, I can see that some of the polygons are GeometryCollection.

How can I convert these to polygons, or otherwise correct the GeometryCollection features?


Technically, I could just use the GPKG file format, which does not cause the error. However, I would like to understand why I cannot export a shapefile and why the GeometryCollection features appear in the first place.

Best Answer

The intersection must have returned mixed geometry types, in which case it is normal to get a GeometryCollection. Where two polygons overlap in one place and only share an edge or a vertex in another, the intersection can contain any geometry type (point, line, polygon).

You get the error because you are trying to save a GeometryCollection to a shapefile, and a shapefile can hold only one geometry type.
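As a rough illustration of how a polygon-on-polygon intersection can produce a mixed result, here is a minimal sketch using Shapely, the geometry library GeoPandas builds on. The two shapes are invented for the example and are not taken from the question's data: a square, and an L-shaped polygon whose lower arm overlaps the square while its upper arm only shares an edge with it.

```python
from shapely.geometry import Polygon

# A 2 x 2 square (made-up coordinates for illustration).
square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])

# An L-shaped polygon: its lower arm overlaps the square,
# while its upper arm only touches the square along the edge x = 2.
l_shape = Polygon([(1, 0), (3, 0), (3, 2), (2, 2), (2, 1), (1, 1)])

result = square.intersection(l_shape)

# The overlap is a polygon and the shared edge is a line,
# so the two parts come back wrapped in a single collection.
print(result.geom_type)  # GeometryCollection
for part in result.geoms:
    print(part.geom_type, part.wkt)
```

A record like this is what the shapefile driver refuses to write, since the line part does not fit a POLYGON layer.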

To avoid that, pass keep_geom_type=True to the overlay function, so that only geometries of the same type as the first input (here, polygons) are returned.

df_result = gpd.overlay(large, df, how='intersection', keep_geom_type=True)
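
If the GeometryCollection features are already in a result (for example in the GPKG file), one possible way to convert them is to keep only their polygonal parts. The sketch below uses plain Shapely. `polygons_only` is a hypothetical helper written for this example, not a GeoPandas or Shapely function, and it assumes collections are not nested inside other collections.

```python
from shapely.geometry import GeometryCollection, LineString, MultiPolygon, Polygon


def polygons_only(geom):
    """Return the polygonal part of geom, or None if it has none.

    Hypothetical helper for illustration; nested collections are not handled.
    """
    if geom.geom_type in ("Polygon", "MultiPolygon"):
        return geom
    if geom.geom_type == "GeometryCollection":
        polys = []
        for part in geom.geoms:
            if part.geom_type == "Polygon":
                polys.append(part)
            elif part.geom_type == "MultiPolygon":
                polys.extend(part.geoms)
        if len(polys) == 1:
            return polys[0]
        if polys:
            return MultiPolygon(polys)
    return None  # points and lines carry no polygonal part


# Made-up mixed geometry: a unit square plus a stray line.
mixed = GeometryCollection([
    Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
    LineString([(1, 1), (1, 2)]),
])
cleaned = polygons_only(mixed)
print(cleaned.geom_type)  # Polygon
```

On a GeoDataFrame this could be applied with something like appended_concat.geometry.apply(polygons_only), after which rows whose geometry came back as None would need to be dropped before writing the shapefile.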