QGIS INSPIRE WFS – How to Extract Features from XML Response in QGIS

inspireogr2ogrqgiswfsxml

Context

In QGIS, I add OpenStreetMap as an XYZ layer. I zoom to the Belgian-Dutch-German tripoint. I press F12 (fn-F12 on Mac) to get Network Logger up. I add a WFS connection to visualize North Rhine-Westphalia's parcels (source website):

https://www.wfs.nrw.de/geobasis/wfs_nw_inspire-flurstuecke_alkis

I choose the Flurstück layer. Display at this point:

enter image description here

I would like to reproduce the creation of the Flurstück features using the link to one of the GET requests. The last request:

995 GET
https://www.wfs.nrw.de/geobasis/wfs_nw_inspire-flurstuecke_alkis?SERVICE=WFS&REQUEST=GetFeature&VERSION=2.0.0&TYPENAMES=cp:CadastralParcel&COUNT=100000&SRSNAME=urn:ogc:def:crs:EPSG::3857&BBOX=672989.01836707652546465,6580237.92264114506542683,676336.63031239178963006,6583675.5681595616042614,urn:ogc:def:crs:EPSG::3857&NAMESPACES=xmlns(cp,http://inspire.ec.europa.eu/schemas/cp/4.0)&NAMESPACE=xmlns(cp,http://inspire.ec.europa.eu/schemas/cp/4.0)


Attempt 1: ogr2ogr

Set the link as an environment variable:

URL="https://www.wfs.nrw.de/geobasis/wfs_nw_inspire-flurstuecke_alkis?SERVICE=WFS&REQUEST=GetFeature&VERSION=2.0.0&TYPENAMES=cp:CadastralParcel&COUNT=100000&SRSNAME=urn:ogc:def:crs:EPSG::3857&BBOX=672989.01836707652546465,6580237.92264114506542683,676336.63031239178963006,6583675.5681595616042614,urn:ogc:def:crs:EPSG::3857&NAMESPACES=xmlns(cp,http://inspire.ec.europa.eu/schemas/cp/4.0)&NAMESPACE=xmlns(cp,http://inspire.ec.europa.eu/schemas/cp/4.0)"

Download response:

curl $URL -o resp.xml

Inspect resp.xml:

enter image description here

Reassuringly, it contains parcel polygons. Try to extract them with ogr2ogr:

ogr2ogr parcels.geojson resp.xml CadastralParcel

Output (via head -c2000 parcels.geojson):

{
"type": "FeatureCollection",
"name": "CadastralParcel",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:EPSG::3857" } },
"features": [
{ "type": "Feature", "properties": { "gml_id": "CadastralParcel_05417802400487______", "identifier": "https://registry.gdi-de.org/id/de.nw.inspire.cp.alkis/CadastralParcel_05417802400487______", "areaValue": 64.0, "beginLifespanVersion": "2013-03-19T15:07:34Z", "endLifespanVersion": "", "localId": "CadastralParcel_05417802400487______", "namespace": "https://registry.gdi-de.org/id/de.nw.inspire.cp.alkis", "label": 487, "nationalCadastralReference": "05417802400487______", "validFrom": "1992-01-01", "validTo": "", "zoning": "" }, "geometry": { "type": "Point", "coordinates": [ 673289.935999999986961, 6582283.457999999634922 ] } },
{ "type": "Feature", "properties": { "gml_id": "CadastralParcel_05417802400652______", "identifier": "https://registry.gdi-de.org/id/de.nw.inspire.cp.alkis/CadastralParcel_05417802400652______", "areaValue": 9.0, "beginLifespanVersion": "2013-03-19T15:07:34Z", "endLifespanVersion": "", "localId": "CadastralParcel_05417802400652______", "namespace": "https://registry.gdi-de.org/id/de.nw.inspire.cp.alkis", "label": 652, "nationalCadastralReference": "05417802400652______", "validFrom": "2010-01-01", "validTo": "", "zoning": "" }, "geometry": { "type": "Point", "coordinates": [ 673568.26500000001397, 6582331.541000000201166 ] } },
{ "type": "Feature", "properties": { "gml_id": "CadastralParcel_05417802400378______", "identifier": "https://registry.gdi-de.org/id/de.nw.inspire.cp.alkis/CadastralParcel_05417802400378______", "areaValue": 369.0, "beginLifespanVersion": "2015-05-05T15:35:02Z", "endLifespanVersion": "", "localId": "CadastralParcel_05417802400378______", "namespace": "https://registry.gdi-de.org/id/de.nw.inspire.cp.alkis", "label": 378, "nationalCadastralReference": "05417802400378______", "validFrom": "1985-01-01", "validTo": "", "zoning": "" }, "geometry": { "type": "Point", "coo

Unfortunately, parcels.geojson only contains reference points to parcels, and not the parcel coordinates themselves.


Attempt 2: BeautifulSoup, Shapely and GeoPandas

This is clearly a suboptimal workaround. Using Python:

import requests
import shapely.wkt
import geopandas as gpd
from bs4 import BeautifulSoup

soup = BeautifulSoup(requests.get(
    "https://www.wfs.nrw.de/geobasis/wfs_nw_inspire-flurstuecke_alkis?SERVICE=WFS&REQUEST=GetFeature&VERSION=2.0.0&TYPENAMES=cp:CadastralParcel&COUNT=100000&SRSNAME=urn:ogc:def:crs:EPSG::3857&BBOX=669448.51340211916249245,6576908.11438542790710926,673143.3303862038301304,6581970.96679305844008923,urn:ogc:def:crs:EPSG::3857&NAMESPACES=xmlns(cp,http://inspire.ec.europa.eu/schemas/cp/4.0)&NAMESPACE=xmlns(cp,http://inspire.ec.europa.eu/schemas/cp/4.0)"
).content, "xml")

coordstrings = [each.text for each in soup.findAll("gml:posList")]

new_coordstrings=[]
for coordstring in coordstrings:
    new_coordstring = ''
    coordstring_split = coordstring.split(' ')
    for index, each in enumerate(coordstring_split):
        new_coordstring += (each)
        if index!=len(coordstring_split)-1:
            if index%2==1:
                new_coordstring += ', '
            else:
                new_coordstring += ' '
    new_coordstrings.append(new_coordstring)

polygon_strings = [f"POLYGON(({new_coordstring}))" for new_coordstring in new_coordstrings]
res=gpd.GeoSeries([shapely.wkt.loads(polygon_string) for polygon_string in polygon_strings],crs=3857)

res.to_file("res.geojson")

Visualize res.geojson in QGIS:

enter image description here

So the script did the job, but it is clear this is not what QGIS is doing under the hood.


Question

How to replicate QGIS and extract the parcel data from the XML file?

I am certain QGIS does not use BeautifulSoup and such workarounds as I did above. I suspect it uses ogr2ogr. If there is a simple programmatic way of extracting the parcel polygons, I am interested even if QGIS is doing it differently.

Having an answer to this question will help me solve a different but related question of mine.

Best Answer

I used your curl command for downloading test data. Then I checked the layers with ogrinfo

ogrinfo GMLAS:inspire.xml 

One of the layers is this

8: cadastralparcel (Unknown (any), Point)

It seems to contain two geometries, first is of some undefined geometry type, the second one is point. GeoJSON can only have one geometry field and it is not sure what will happen in the conversion but we can try.

ogr2ogr -f GeoJSON inspirecp.json GMLAS:inspire.xml cadastralparcel

Check the result:

ogrinfo inspirecp.json -al |more

INFO: Open of `inspirecp.json'
      using driver `GeoJSON' successful.

Layer name: cadastralparcel
Geometry: Polygon
Feature Count: 3858
Extent: (671494.159000, 6579594.587000) - (676942.858000, 6585030.326000)
...

At least there are now parcel polygons in a GeoJSON file.