GeoPandas – Resolving ‘LineStrings Must Have At Least 2 Coordinate Tuples’ Error in GPX Files

geopandas, python, shapely

I am trying to use GeoPandas to read some GPX files (generated by a Garmin Vista HCX). For many of these files, given this code:

>>> import geopandas as gpd
>>> data = gpd.read_file('filename.gpx', layer='tracks')

I am seeing the following traceback:

Traceback (most recent call last):
  File "shapely/speedups/_speedups.pyx", line 86, in shapely.speedups._speedups.geos_linestring_from_py
AttributeError: 'list' object has no attribute '__array_interface__'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/geopandas/io/file.py", line 96, in read_file
    gdf = GeoDataFrame.from_features(f_filt, crs=crs, columns=columns)
  File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/geopandas/geodataframe.py", line 233, in from_features
    d = {'geometry': shape(f['geometry']) if f['geometry'] else None}
  File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/geo.py", line 42, in shape
    return MultiLineString(ob["coordinates"])
  File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/multilinestring.py", line 52, in __init__
    self._geom, self._ndim = geos_multilinestring_from_py(lines)
  File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/multilinestring.py", line 134, in geos_multilinestring_from_py
    geom, ndims = linestring.geos_linestring_from_py(obs[l])
  File "shapely/speedups/_speedups.pyx", line 152, in shapely.speedups._speedups.geos_linestring_from_py
ValueError: LineStrings must have at least 2 coordinate tuples

This happens for many GPX files, but not all of them. I'm not sure what to do about this, since I have no control over how the files are generated. Viking and QGIS are able to read them without a problem.

I'm using GeoPandas version 0.5.0 and Shapely version 1.7a1.

I can reproduce this problem by using fiona and shapely directly, like this:

>>> import fiona
>>> import shapely.geometry
>>> layer = fiona.open('20190530.gpx', layer='tracks')
>>> data = {'type': 'MultiLineString', 'coordinates': layer[0]['geometry']['coordinates']}
>>> shp = shapely.geometry.shape(data)
Traceback (most recent call last):
  File "shapely/speedups/_speedups.pyx", line 86, in shapely.speedups._speedups.geos_linestring_from_py
AttributeError: 'list' object has no attribute '__array_interface__'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/geo.py", line 42, in shape
    return MultiLineString(ob["coordinates"])
  File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/multilinestring.py", line 52, in __init__
    self._geom, self._ndim = geos_multilinestring_from_py(lines)
  File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/multilinestring.py", line 134, in geos_multilinestring_from_py
    geom, ndims = linestring.geos_linestring_from_py(obs[l])
  File "shapely/speedups/_speedups.pyx", line 152, in shapely.speedups._speedups.geos_linestring_from_py
ValueError: LineStrings must have at least 2 coordinate tuples
>>>

If we look at the offending feature, we see:

>>> layer[0]['geometry']['coordinates']
[[(-71.162904, 42.384345)]]

…and indeed, it's hard to represent a line with a single point. But what's the workaround here? It doesn't look like I can delete individual features after reading the data with fiona:

>>> del layer[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Collection' object doesn't support item deletion

I'm not sure how else to pre-process this data so that I can get it into a GeoDataFrame.
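To see how many features are affected before deciding on a fix, the coordinates can be scanned for short segments. The following is only a sketch: the helper name short_segment_indices and the sample data are hypothetical, and it assumes each feature is a GeoJSON-like mapping with a MultiLineString geometry, as the tracks layer returns above.

```python
def short_segment_indices(features):
    """Return indices of features with a track segment of fewer than 2 points.

    Assumes each feature is a GeoJSON-like mapping whose geometry is a
    MultiLineString: a list of segments, each a list of coordinate tuples.
    """
    bad = []
    for index, feature in enumerate(features):
        geometry = feature["geometry"]
        if not geometry:
            continue  # no geometry at all, so nothing to measure
        if any(len(segment) < 2 for segment in geometry["coordinates"]):
            bad.append(index)
    return bad


# Hypothetical sample data mirroring the offending feature shown above.
sample = [
    {"geometry": {"type": "MultiLineString",
                  "coordinates": [[(-71.162904, 42.384345)]]}},
    {"geometry": {"type": "MultiLineString",
                  "coordinates": [[(-71.16, 42.38), (-71.17, 42.39)]]}},
]
print(short_segment_indices(sample))
```

With a real file, the opened fiona layer could presumably be passed in place of sample, since iterating over it yields one feature at a time.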

Best Answer

I don't know if this is The Right Way of handling this, but I came up with the following solution:

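import io

import fiona
import geopandas as gpd


def read_tracks_filtered(path):
    """Read the 'tracks' layer of a GPX file, skipping single-point tracks."""
    with fiona.open(path, layer='tracks') as src:
        # Reuse the source schema and CRS, but write GeoJSON instead of GPX.
        meta = src.meta
        meta['driver'] = 'GeoJSON'

        with io.BytesIO() as buffer:
            with fiona.open(buffer, 'w', **meta) as dst:
                for feature in src:
                    # Keep only tracks whose first segment has at least two points.
                    if len(feature['geometry']['coordinates'][0]) > 1:
                        dst.write(feature)

            # Rewind so geopandas reads the GeoJSON from the start.
            buffer.seek(0)
            df = gpd.read_file(buffer, driver='GeoJSON')

    return df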

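This iterates over the features in the tracks layer of the GPX file and discards those whose first track segment contains only a single point. It then builds an in-memory GeoJSON representation of the remaining features, which we can pass to geopandas.read_file. Note that only the first segment of each track is checked, so a track with a single-point segment later on would still raise the same error.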
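A variation on the same idea is to drop only the degenerate segments rather than deciding based on the first one. This is a minimal sketch, not a tested replacement for the function above: drop_short_segments and the sample features are hypothetical names, and it assumes every geometry in the tracks layer is a MultiLineString.

```python
def drop_short_segments(feature):
    """Return a copy of the feature without segments of fewer than 2 points.

    Returns None when the feature has no geometry or no usable segment left.
    Assumes a GeoJSON-like mapping with a MultiLineString geometry.
    """
    geometry = feature["geometry"]
    if not geometry:
        return None
    segments = [list(seg) for seg in geometry["coordinates"] if len(seg) >= 2]
    if not segments:
        return None
    return {
        "type": "Feature",
        "properties": dict(feature["properties"]),
        "geometry": {"type": "MultiLineString", "coordinates": segments},
    }


# Hypothetical features: one single-point track, one track with a mixed pair
# of segments (the single-point segment should be dropped, the track kept).
features = [
    {"properties": {"name": "a"},
     "geometry": {"type": "MultiLineString",
                  "coordinates": [[(-71.162904, 42.384345)]]}},
    {"properties": {"name": "b"},
     "geometry": {"type": "MultiLineString",
                  "coordinates": [[(-71.16, 42.38), (-71.17, 42.39)],
                                  [(-71.18, 42.40)]]}},
]
cleaned = [f for f in map(drop_short_segments, features) if f is not None]
print(len(cleaned), cleaned[0]["properties"]["name"])
```

The cleaned list could then be handed to GeoDataFrame.from_features, the same call that geopandas.read_file makes in the traceback above, which would avoid the round trip through an in-memory GeoJSON file. The CRS would need to be supplied separately in that case.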