Geopackage – How to Read Geopackage Geometries in Python

geometrygeopackagepythonwell-known-binary

Let me state up front that I'm totally new to the field of GIS. I'm doing a project involving a database of buildings and their geometries in my home country (the Netherlands). At the moment, I'm trying to read geometry data from a geopackage using a python script.

If I understand the geopackage specification correctly, geometry data in a geopackage is a form of a well-known-binary. I tried to find a python package that can translate the binaries in the geopackage to usable geometries. I.e., I want to be able extract the points/linestrings/etc from the binaries.

I thought I could find a python package that would be able to do this, but I either don't understand how they work or they don't work with the data I have.

As an example, I have the following binary:

binary representation of geometry

or, as represented in python:

b'\x40\x71\x00\x00\x71\x3d\x0a\xd7\x05\xc3\x0e\x41\x23\xc3\x0e\x41\x71\x3d\x0a\xd7\xe3\x1c\x22\x41\xec\x1c\x22\x41\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xeb\x03\x00\x00\x01\x00\x00\x00\xcd\xcc\xcc\xcc\x10\xc3\x0e\x41\x5c\x8f\xc2\x41\x00\x00\x00\x00\x00\x00\x00\x00\x71\x3d\x0a\x41\xcd\xcc\xcc\xcc\xe5\x1c\x22\x41\x00\x00\x00\x00\xcd\xcc\xcc\xcc\x18\xc3\x0e\x41\x71\x3d\x0a\x41\x00\x00\x00\x00\x00\x00\x00\x00\x29\x5c\x8f\x41\x52\xb8\x1e\x85\xea\x1c\x22\x41\x00\x00\x00\x00\xcd\xcc\xcc\xcc\x10\xc3\x0e\x41\x5c\x8f\xc2\x41\x00\x00\x00\x00\x00\x00\x00\x00'

When I try to read a binary using for example the shapely package, I get an error:

from shapely import wkb
geom = b'\x40\x71\x00\x00\x71\x3d\x0a\xd7\x05\xc3\x0e\x41\x23\xc3\x0e\x41\x71\x3d\x0a\xd7\xe3\x1c\x22\x41\xec\x1c\x22\x41\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xeb\x03\x00\x00\x01\x00\x00\x00\xcd\xcc\xcc\xcc\x10\xc3\x0e\x41\x5c\x8f\xc2\x41\x00\x00\x00\x00\x00\x00\x00\x00\x71\x3d\x0a\x41\xcd\xcc\xcc\xcc\xe5\x1c\x22\x41\x00\x00\x00\x00\xcd\xcc\xcc\xcc\x18\xc3\x0e\x41\x71\x3d\x0a\x41\x00\x00\x00\x00\x00\x00\x00\x00\x29\x5c\x8f\x41\x52\xb8\x1e\x85\xea\x1c\x22\x41\x00\x00\x00\x00\xcd\xcc\xcc\xcc\x10\xc3\x0e\x41\x5c\x8f\xc2\x41\x00\x00\x00\x00\x00\x00\x00\x00'
point = wkb.loads(geom)
print(point)
shapely.errors.WKBReadingError: Could not create geometry because of errors while reading input.

To read geometry data from geopackages in python, will I have to write my own parser?

Best Answer

You can use the gdal/ogr, fiona (built on gdal/ogr) or geopandas (built on fiona) python libraries.

Below is a fiona example:

import fiona

# No need to pass "layer='etc'" if there's only one layer
with fiona.open('test.gpkg', layer='layer_of_interest') as layer:
    for feature in layer:
        print(feature['geometry'])

Partial output for one record in my data:

{'type': 'MultiPolygon', 'coordinates': [[[(147.01294051, -34.75046834699997), (147.01289292100003, -34.75075388199998), (147.01251220799998, -34.75068249799993), etc...

From the fiona docs:

A record you get from a collection is a Python dict structured exactly like a GeoJSON Feature. Fiona records are self-describing; the names of its fields are contained within the data structure and the values in the fields are typed properly for the type of record. Numeric field values are instances of type int and float, for example, not strings.