GeoPandas – How to Read and Import Inspire XML Data with GeoPandas

geodataframe geopandas import inspire xml

I am trying to plot the German railway network from here, using GeoPandas. The download is an Inspire XML file. gpd.read_file only seems to read the header, and importing it as a pandas DataFrame with pd.read_xml gives nothing but NaNs.

How can I read Inspire XML files using GeoPandas?

Best Answer

This is a GML file with layers. To read a layer, pass the layer name:

In [6]: db = gpd.read_file("DB-Netz_INSPIRE_20200217.xml",layer="RailwayLine")
/home/rowlings/.local/lib/python3.8/site-packages/geopandas/geodataframe.py:600: RuntimeWarning: Sequential read of iterator was interrupted. Resetting iterator. This can negatively impact the performance.
  for feature in features_lst:

In [7]: db
Out[7]: 
           gml_id                     identifier beginLifespanVersion      localId  ... grammaticalNumber validFrom railwayLineCode geometry
0     Line-953080  urn:x-dbnetze:oid:Line-953080                 None  Line-953080  ...              None      None            6321     None
1     Line-953081  urn:x-dbnetze:oid:Line-953081                 None  Line-953081  ...              None      None            6406     None
2     Line-953082  urn:x-dbnetze:oid:Line-953082                 None  Line-953082  ...              None      None            2112     None
3     Line-953083  urn:x-dbnetze:oid:Line-953083                 None  Line-953083  ...              None      None            5864     None
4     Line-953084  urn:x-dbnetze:oid:Line-953084                 None  Line-953084  ...              None      None            6270     None
...           ...                            ...                  ...          ...  ...               ...       ...             ...      ...
1504  Line-954584  urn:x-dbnetze:oid:Line-954584                 None  Line-954584  ...              None      None            2666     None
1505  Line-954585  urn:x-dbnetze:oid:Line-954585                 None  Line-954585  ...              None      None            3525     None
1506  Line-954586  urn:x-dbnetze:oid:Line-954586                 None  Line-954586  ...              None      None            2901     None
1507  Line-954587  urn:x-dbnetze:oid:Line-954587                 None  Line-954587  ...              None      None            1923     None
1508  Line-954588  urn:x-dbnetze:oid:Line-954588                 None  Line-954588  ...              None      None            6070     None

I'm not sure how to get all the layer names in Python, but running ogrinfo on the file at the command line lists them:

INFO: Open of `DB-Netz_INSPIRE_20200217.xml'
      using driver `GML' successful.
1: Network (None)
2: ConditionOfFacility (None)
3: MarkerPost (Point)
4: TrafficFlowDirection (None)
5: VerticalPosition (None)
6: DesignSpeed (None)
7: NominalTrackGauge (None)
8: NumberOfTracks (None)
9: RailwayElectrification (None)
10: RailwayLine (None)
11: RailwayLink (Line String)
12: RailwayLinkSequence (None)
13: RailwayNode (Point)
14: RailwayStationNode (Point)
15: RailwayStationCode (None)
16: RailwayType (None)
17: RailwayUse (None)
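
For listing layers from Python, the GDAL-based readers usually expose this directly (for example fiona.listlayers(path) when Fiona is installed). As a dependency-free cross-check, the sketch below scans the GML with the standard library and counts feature element names, which is roughly what the layer names are derived from. It assumes features sit inside wrapper elements named member, featureMember or featureMembers; the function name count_feature_types and the inline sample are made up for illustration, and the real file's structure may differ.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed wrapper element names (namespace prefix ignored).
WRAPPERS = {"member", "featureMember", "featureMembers"}


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def count_feature_types(source):
    """Count feature elements per type in a GML file (path or file object)."""
    counts = Counter()
    # iterparse streams the document, so large files need not fit in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) in WRAPPERS:
            for child in elem:
                counts[local_name(child.tag)] += 1
            elem.clear()  # free the features we have already counted
    return counts


# Tiny hypothetical sample, not taken from the real dataset.
SAMPLE = b"""<?xml version="1.0"?>
<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml/3.2"
                       xmlns:tn="urn:example:tn">
  <gml:featureMember><tn:RailwayLine gml:id="Line-1"/></gml:featureMember>
  <gml:featureMember><tn:RailwayLink gml:id="Link-1"/></gml:featureMember>
  <gml:featureMember><tn:RailwayLink gml:id="Link-2"/></gml:featureMember>
</gml:FeatureCollection>
"""

print(dict(count_feature_types(io.BytesIO(SAMPLE))))
```

On the real file this would be called as count_feature_types("DB-Netz_INSPIRE_20200217.xml"). Wrapper elements nested inside a feature would be counted too, so treat the result as a hint and ogrinfo as the reference.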

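Without a layer option you get the first layer, Network, which has no geometry. The same goes for RailwayLine above: its geometry column is all None, so for plotting you want a layer that carries geometry, such as RailwayLink. How the layers relate to one another should be covered in the dataset's documentation.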
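
As a rough sketch of picking out the plottable layers, the code below parses an ogrinfo summary in the "N: Name (Type)" format shown above and keeps only the layers whose geometry type is not None. The helper names geometry_layers and plot_layer are hypothetical, and the geopandas import is deferred into plot_layer so the parsing part runs without it. Other GDAL versions may format the listing differently.

```python
import re

# Matches ogrinfo summary lines such as "11: RailwayLink (Line String)".
LAYER_LINE = re.compile(r"^\s*\d+:\s+(\S+)\s+\((.+)\)\s*$")


def geometry_layers(ogrinfo_text):
    """Return {layer name: geometry type} for layers that carry geometry."""
    layers = {}
    for line in ogrinfo_text.splitlines():
        match = LAYER_LINE.match(line)
        if match and match.group(2) != "None":
            layers[match.group(1)] = match.group(2)
    return layers


def plot_layer(path, layer):
    """Read one layer and plot it (needs geopandas and matplotlib)."""
    import geopandas as gpd  # deferred so geometry_layers works without it

    gdf = gpd.read_file(path, layer=layer)
    return gdf.plot()


# A few lines copied from the ogrinfo listing above.
listing = """\
1: Network (None)
10: RailwayLine (None)
11: RailwayLink (Line String)
13: RailwayNode (Point)
"""

print(geometry_layers(listing))
```

With geopandas installed, plot_layer("DB-Netz_INSPIRE_20200217.xml", "RailwayLink") should then draw the line geometries.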
