Python – How to Handle DataIncomplete Error in Overpy Query Result

overpass-apioverpass-turbopython

Recently I have asked this question, and I got this answer, which was very helpful in identifying lakes using overpass-turbo.eu. Indeed, if I run the query, I get the desired result. An example, where I have clicked on a fountain is below (Oslo was in the centre of the map when I run the query).

enter image description here

I want to run the same query using Python. I define a bounding box:

bbox = {'latLower':59.9,'lonLower':10.7,'latHigher':60.0,'lonHigher': 10.8}

This bounidng box covers an area in Oslo. Then construct the text of the query:

querytext=\
f'''
/*
From: https://gis.stackexchange.com/a/412100/191643

This has been generated by the overpass-turbo wizard.
The original search was:
“(natural=water and type=multipolygon) or natural=water”
*/
[out:json][timeout:25];
// gather results
(
  // query part for: “natural=water and type=multipolygon”
  node["natural"="water"]["type"="multipolygon"]{overpy_bbox};
  way["natural"="water"]["type"="multipolygon"]{overpy_bbox};
  relation["natural"="water"]["type"="multipolygon"]{overpy_bbox};
  // query part for: “natural=water”
  node["natural"="water"]{overpy_bbox};
  way["natural"="water"]{overpy_bbox};
  relation["natural"="water"]{overpy_bbox};
);
// print results
out body;
'''

Then the query itself using overpy:

import overpy
api = overpy.Overpass()
result = api.query(querytext)

I access all the ways by:

result.get_ways()

which returns a list, a partial screenshot of it:

enter image description here

I highlighted the line which contains the same ID as I had above in the overpass-turbo.eu interface. Accessing the first node, for example:

result.get_ways()[0]

Output:

<overpy.Way id=4258487 nodes=[25526443, 4731901953, 4731901954, 4731901952, 25526546, 6794539693, 25526545, 6794539694, 4731901950, 6794539695, 25526439, 6794539696, 4731901951, 25526544, 4731901957, 7576044285, 7576027484, 25526543, 4731901958, 25526542, 23597818, 7576027483, 5273979151, 4731901959, 23597804, 4751403582, 25526444, 4751403583, 4731901960, 25526445, 4751403584, 4731901961, 4751403585, 25526446, 4751403586, 25526448, 4731901962, 4751403587, 25526452, 347545612, 347545599, 25526454, 23597806, 25526474, 8690362851, 25526558, 4751403592, 25526557, 25526554, 25526553, 25526552, 25526551, 25526498, 25526481, 4731901943, 25526475, 25526456, 23597809, 4731901944, 4731901945, 23597811, 4731901948, 4731901949, 4731901947, 4731901946, 23597814, 4731901963, 6839492557, 6839492556, 23597816, 6839492559, 25526549, 6839492558, 25526550, 7576044287, 25526548, 7576044286, 4731901956, 25526547, 4731901955, 25526443]>

Then I want to access the nodes itself:

result.get_ways()[0].nodes

But this gives me an error:

---------------------------------------------------------------------------
DataIncomplete                            Traceback (most recent call last)
<ipython-input-141-59cf0ff05ef6> in <module>()
----> 1 result.get_ways()[0].nodes

1 frames
/usr/local/lib/python3.7/dist-packages/overpy/__init__.py in nodes(self)
    894         List of nodes associated with the way.
    895         """
--> 896         return self.get_nodes()
    897 
    898     def get_nodes(self, resolve_missing=False):

/usr/local/lib/python3.7/dist-packages/overpy/__init__.py in get_nodes(self, resolve_missing)
    921 
    922             if not resolve_missing:
--> 923                 raise exception.DataIncomplete("Resolve missing nodes is disabled")
    924 
    925             # We tried to resolve the data but some nodes are still missing

DataIncomplete: ('Data incomplete try to improve the query to resolve the missing data', 'Resolve missing nodes is disabled')

So I cannot get data out of the query this way.

How can I access the data which is displayed on the overpass-turbo.eu interface (for example, location of nodes), using Python?

Ideally, I'd like to have shapely polygons / multipolygons for the nodes forming part of a way.

Best Answer

The key was to use get_nodes(resolve_missing=True).

The full code is below.

Install overpy:

%%shell
pip install overpy
pip install geopandas

Imports:

import overpy
from overpy.exception import OverpassTooManyRequests, OverpassGatewayTimeout
import time
from tqdm import tqdm
import folium
from shapely.geometry.polygon import Polygon
import numpy as np
import geopandas as gpd

Define a bounding box:

bbox = {'latLower':59.9,'lonLower':10.75,'latHigher':59.91,'lonHigher': 10.775}

Create a string from it, to be used within Overpass query:

overpy_bbox=\
f"({bbox['latLower']},{bbox['lonLower']},{bbox['latHigher']},{bbox['lonHigher']})"

Create text of Overpass query, relying on this answer:

querytext=\
f'''
/*
From: https://gis.stackexchange.com/a/412100/191643

This has been generated by the overpass-turbo wizard.
The original search was:
“(natural=water and type=multipolygon) or natural=water”
*/
[out:json][timeout:25];
// gather results
(
  // query part for: “natural=water and type=multipolygon”
  node["natural"="water"]["type"="multipolygon"]{overpy_bbox};
  way["natural"="water"]["type"="multipolygon"]{overpy_bbox};
  relation["natural"="water"]["type"="multipolygon"]{overpy_bbox};
  // query part for: “natural=water”
  node["natural"="water"]{overpy_bbox};
  way["natural"="water"]{overpy_bbox};
  relation["natural"="water"]{overpy_bbox};
);
// print results
out body;
'''

Query:

api = overpy.Overpass()
result = api.query(querytext)

Process the query:

waydict={}
for way in tqdm(result.get_ways()):
    while True:
        try:
            nodes = way.get_nodes(resolve_missing=True)
            nodelist=[]
            for node in nodes:
                nodelist.append([float(node.lon), float(node.lat)])
            waydict[way.id]=nodelist
        except OverpassTooManyRequests:
            time.sleep(1)
            print('Retrying...')
            continue
        except OverpassGatewayTimeout:
            time.sleep(10)
            print('OverpassGatewayTimeout, retrying...')
            continue
        break

Process result a bit:

for id, way in waydict.items():
    polygon_obj = Polygon(np.array(way))

polygondict={}
for id, way in waydict.items():
    polygondict[id]= Polygon(np.array(way))

Create bounding box Polygon:

bbox_polygon_obj = Polygon(np.array([
                        [bbox['lonLower'],bbox['latLower']],
                        [bbox['lonLower'],bbox['latHigher']],
                        [bbox['lonHigher'],bbox['latHigher']],
                        [bbox['lonHigher'],bbox['latLower']]
                        ]))

Place results on Folium map:

m = folium.Map(location=[0.5*(bbox['latLower']+bbox['latHigher']),
                         0.5*(bbox['lonLower']+bbox['lonHigher'])], zoom_start=15, tiles='CartoDB positron')

for id, polygon in polygondict.items():
    geo_j = folium.GeoJson(data=gpd.GeoSeries(polygon).to_json(),style_function=lambda x: {'fillColor': 'red'})
    geo_j.add_to(m)

geo_j_bbox = folium.GeoJson(data=gpd.GeoSeries(bbox_polygon_obj).to_json(),style_function=lambda x: {'fillColor': 'red'})
geo_j_bbox.add_to(m)

m

A screenshot of the result:

enter image description here

Ie we got accessed the polygon data using Python & showed them on a map (with the bounding box as well): aim achieved.

Related Question