I am trying to write a Python script that pulls GML tags out of an XML file and formats them as WKT for inserting into a PostGIS database. I have been successful in doing so for an XML file containing a single-part polygon, using the following code:

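    import xml.etree.ElementTree as ET

    rootElement = ET.parse("GMLExample_Polygon.xml").getroot()
    wkt = ""
    # iter() replaces getiterator(), which was removed in Python 3.9
    for subelement in rootElement.iter():
        for subsub in subelement:
            if subsub.tag == "{}X":
                x = subsub.text
            if subsub.tag == "{}Y":
                y = subsub.text
                # each Y closes a coordinate pair, so emit "x y, " here
                point_for_pol = "%s %s, " % (x, y)
                wkt += point_for_pol
    wkt = wkt[:-2]  # drop the trailing ", "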

This code clearly won't work for multipolygons. I am unsure how to access each polygon tag (<gml:Polygon srsName="BNG">) separately and pull out only the geometry nested under it. I am using ElementTree, but I am not sure whether it is the best module for this. The XML is structured as follows:

<Order xsi:noNamespaceSchemaLocation="">
            <Street>long street</Street>
            <PostCode>PN1 1PN</PostCode>
            <gml:Polygon srsName="BNG">
            <gml:Polygon srsName="BNG">

Thanks for any help.

Best Answer

I enjoy using ElementTree. It has been part of the Python standard library since 2.5 as xml.etree.ElementTree. Forgive me for being blunt, but you're using it wrong. When you know the structure of the data, I suggest using the find, findtext, and findall methods instead of walking every element. Is Order your root element? If so,

>>> geography = rootElement.find('OrderRequest/SiteGeography')
>>> for polygon in geography.findall('{}Polygon'):
...     for coord in polygon.findall(
...             "{}outerBoundaryIs/"
...             "{}LinearRing/"
...             "{}coord"):
...         print((
...             coord.findtext("{}X"),
...             coord.findtext("{}Y")))
('452847.6009', '18596.0496')
('415847.6009', '184596.0496')
('415847.6009', '184596.0496')
('452847.6009', '18596.0496')
('415847.6009', '184596.0496')
('415847.6009', '184596.0496')

The ElementTree documentation has more advice on using ElementTree.
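
To get from those per-polygon coordinates to the WKT the question asks for, something along these lines might work. It is only a sketch under some assumptions: the namespace URI in GML is a guess (check the xmlns:gml attribute in your own file), SAMPLE is a made-up, cut-down document, and the helper names polygon_rings and to_multipolygon_wkt are hypothetical. It only reads outer boundaries, so holes are ignored.

```python
import xml.etree.ElementTree as ET

# Assumption: the file declares the GML namespace with this URI.
GML = "{http://www.opengis.net/gml}"

# Hypothetical sample following the structure assumed in the answer above.
SAMPLE = """\
<Order xmlns:gml="http://www.opengis.net/gml">
  <OrderRequest>
    <SiteGeography>
      <gml:Polygon srsName="BNG">
        <gml:outerBoundaryIs><gml:LinearRing>
          <gml:coord><gml:X>0</gml:X><gml:Y>0</gml:Y></gml:coord>
          <gml:coord><gml:X>0</gml:X><gml:Y>1</gml:Y></gml:coord>
          <gml:coord><gml:X>1</gml:X><gml:Y>1</gml:Y></gml:coord>
          <gml:coord><gml:X>0</gml:X><gml:Y>0</gml:Y></gml:coord>
        </gml:LinearRing></gml:outerBoundaryIs>
      </gml:Polygon>
      <gml:Polygon srsName="BNG">
        <gml:outerBoundaryIs><gml:LinearRing>
          <gml:coord><gml:X>5</gml:X><gml:Y>5</gml:Y></gml:coord>
          <gml:coord><gml:X>5</gml:X><gml:Y>6</gml:Y></gml:coord>
          <gml:coord><gml:X>6</gml:X><gml:Y>6</gml:Y></gml:coord>
          <gml:coord><gml:X>5</gml:X><gml:Y>5</gml:Y></gml:coord>
        </gml:LinearRing></gml:outerBoundaryIs>
      </gml:Polygon>
    </SiteGeography>
  </OrderRequest>
</Order>
"""


def polygon_rings(root):
    """Yield one list of "x y" strings per gml:Polygon found under root."""
    ring_path = "{0}outerBoundaryIs/{0}LinearRing/{0}coord".format(GML)
    for polygon in root.iter(GML + "Polygon"):
        # findall() is scoped to this polygon, so rings never get mixed up
        yield [
            "%s %s" % (coord.findtext(GML + "X"), coord.findtext(GML + "Y"))
            for coord in polygon.findall(ring_path)
        ]


def to_multipolygon_wkt(root):
    """Join the outer rings into a MULTIPOLYGON WKT string (holes ignored)."""
    parts = ["((%s))" % ", ".join(ring) for ring in polygon_rings(root)]
    return "MULTIPOLYGON(%s)" % ", ".join(parts)


root = ET.fromstring(SAMPLE)
wkt = to_multipolygon_wkt(root)
print(wkt)
```

Because each ring is collected from its own Polygon element, there is no trailing ", " to slice off, and a file with a single polygon simply produces a MULTIPOLYGON with one member.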

