I am trying to use geopandas to read in some gpx files (generated from a Garmin vista HCX). For many gpx files, given this code:
>>> import geopandas as gpd
>>> data = gpd.read_file('filename.gpx', layer='tracks')
I am seeing the following traceback:
Traceback (most recent call last):
File "shapely/speedups/_speedups.pyx", line 86, in shapely.speedups._speedups.geos_linestring_from_py
AttributeError: 'list' object has no attribute '__array_interface__'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/geopandas/io/file.py", line 96, in read_file
gdf = GeoDataFrame.from_features(f_filt, crs=crs, columns=columns)
File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/geopandas/geodataframe.py", line 233, in from_features
d = {'geometry': shape(f['geometry']) if f['geometry'] else None}
File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/geo.py", line 42, in shape
return MultiLineString(ob["coordinates"])
File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/multilinestring.py", line 52, in __init__
self._geom, self._ndim = geos_multilinestring_from_py(lines)
File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/multilinestring.py", line 134, in geos_multilinestring_from_py
geom, ndims = linestring.geos_linestring_from_py(obs[l])
File "shapely/speedups/_speedups.pyx", line 152, in shapely.speedups._speedups.geos_linestring_from_py
ValueError: LineStrings must have at least 2 coordinate tuples
This happens for many gpx files but not all gpx files. I'm not sure what to do about this. I don't have any control over how the files are generated. Viking and QGIS are able to read them without a problem.
I'm using Geopandas version 0.5.0 and Shapely version 1.7a1.
I can reproduce this problem by using fiona and shapely directly, like this:
>>> import fiona
>>> import shapely.geometry
>>> layer = fiona.open('20190530.gpx', layer='tracks')
>>> data = {'type': 'MultiLineString', 'coordinates': layer[0]['geometry']['coordinates']}
>>> shp = shapely.geometry.shape(data)
Traceback (most recent call last):
File "shapely/speedups/_speedups.pyx", line 86, in shapely.speedups._speedups.geos_linestring_from_py
AttributeError: 'list' object has no attribute '__array_interface__'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/geo.py", line 42, in shape
return MultiLineString(ob["coordinates"])
File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/multilinestring.py", line 52, in __init__
self._geom, self._ndim = geos_multilinestring_from_py(lines)
File "/home/lars/.local/share/virtualenvs/gps-TT0qNP_P/lib/python3.7/site-packages/shapely/geometry/multilinestring.py", line 134, in geos_multilinestring_from_py
geom, ndims = linestring.geos_linestring_from_py(obs[l])
File "shapely/speedups/_speedups.pyx", line 152, in shapely.speedups._speedups.geos_linestring_from_py
ValueError: LineStrings must have at least 2 coordinate tuples
>>>
If we look at the offending feature, we see:
>>> layer[0]['geometry']['coordinates']
[[(-71.162904, 42.384345)]]
…and indeed, it's hard to represent a line with a single point. But what's the workaround here? It doesn't look like I can delete individual features after reading the data with fiona:
>>> del layer[0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Collection' object doesn't support item deletion
I'm not sure how else to pre-process this data so that I can get it into a GeoDataFrame.
Best Answer
I don't know if this is The Right Way of handling this, but I came up with the following solution:
This will iterate over the features in the tracks layer of the GPX file and discard those which have only a single point (in the first track segment). It generates an in-memory GeoJSON representation of the data, which we can pass to geopandas.read_file.
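A minimal sketch of that approach might look like the following. The helper names (has_usable_track, filter_tracks, read_gpx_tracks) are made up for illustration. It also assumes that the track properties are JSON-serializable and that geopandas.read_file accepts a GeoJSON string, which depends on the underlying GDAL/fiona version.

```python
import json


def has_usable_track(feature):
    """Return True if the first track segment has at least two points."""
    geometry = feature.get("geometry")
    if not geometry:
        return False
    segments = geometry.get("coordinates") or []
    # Only the first segment is checked, as described above.
    return bool(segments) and len(segments[0]) >= 2


def filter_tracks(features):
    """Build a GeoJSON FeatureCollection dict without single-point tracks."""
    return {
        "type": "FeatureCollection",
        "features": [f for f in features if has_usable_track(f)],
    }


def read_gpx_tracks(path):
    """Read the 'tracks' layer of a GPX file into a GeoDataFrame."""
    # Imported here so the helpers above can be used without these packages.
    import fiona
    import geopandas as gpd

    with fiona.open(path, layer="tracks") as layer:
        # Copy each record into plain dicts so that json can serialize it.
        features = [
            {
                "type": "Feature",
                "properties": dict(rec["properties"]),
                "geometry": {
                    "type": rec["geometry"]["type"],
                    "coordinates": rec["geometry"]["coordinates"],
                },
            }
            for rec in layer
        ]

    # Assumption: read_file accepts GeoJSON text in place of a filename.
    return gpd.read_file(json.dumps(filter_tracks(features)))
```

Calling read_gpx_tracks('20190530.gpx') should then return a GeoDataFrame without the single-point track. If read_file rejects the GeoJSON string in your environment, passing filter_tracks(features)["features"] to GeoDataFrame.from_features is another option.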