[GIS] Assessing OpenStreetMap coverage for international routing

esri-coverage-format openstreetmap routing

I posted the following question on the main StackOverflow website, but was informed that I might get a better response here, so I'm copying it below. Note, however, that I have an active bounty on the original posting.


I have been using a commercial solution for route distances and travel times for North America and Western/Central Europe. I am considering expanding the project to cover other countries, and perhaps the entire world. A very limited budget and patchy regional coverage from individual commercial providers probably make locally hosted OpenStreetMap the only viable option. Before someone suggests an online solution: my application requires a lot of intensive route calculation, which would be costly or very impolite (and would probably get me banned) if performed against a web service. The results of the calculations are put back in the public domain, so crediting OpenStreetMap is not a problem.

My problem is: how do I assess the routing data coverage for individual countries in the OpenStreetMap database? Such an assessment could determine whether the project is viable, and suggest a suitable order for processing (i.e. do the countries with the best coverage first).

High-end commercial data providers can typically supply statistical descriptions, as well as regional descriptions of surveyed coverage. OpenStreetMap is much more patchy: an area typically includes some roads, but not all roads. Individual location errors of a few metres or even 10-20 m will not be a problem for my application (I'm looking at city-to-city distances), but route graph connectivity is. That is, the road vectors must logically meet at a junction.
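To make the connectivity requirement concrete, here is a minimal sketch of one possible check. It assumes the road network has already been extracted into pairs of node IDs (one pair per road segment) and that two roads count as joined only when they share a node ID. The function name and input format are illustrative assumptions, not part of any OpenStreetMap tool.

```python
from collections import Counter


def largest_component_fraction(edges):
    """Return the fraction of road nodes in the largest connected component.

    `edges` is an iterable of (node_id, node_id) pairs, one per road segment.
    This input format is an assumption made for the sketch.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    if not parent:
        return 0.0

    # Count how many nodes end up under each root.
    sizes = Counter(find(node) for node in list(parent))
    return max(sizes.values()) / len(parent)
```

A value close to 1.0 would suggest that most of the mapped roads in a region form one routable network; a low value would point to many disconnected fragments. Islands will legitimately lower the figure, so it is only an indirect indicator.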

Has anyone attempted to create statistics describing data coverage of the OpenStreetMap database?

If not, how would you go about it?

The best I can think of is to take a random sample of places (e.g. cities) and then attempt to calculate routes between them. This relies on the assumption that major roads tend to be added before minor roads, so that a route between two distant cities uses the logical major road rather than falling back on a minor road (typically longer or slower) because the major road is missing.

Another problem is that it is physically impossible to drive between many towns. Often this is due to islands (where ferries could be used), but sometimes there is no surface route at all (e.g. settlements in Nunavut). So how would such statistics be used when comparing, say, Tonga and Afghanistan? Afghanistan probably has very low data coverage. Tonga's is probably better, but its settlements are spread out across an archipelago.

Some details about my application: All start and end points are towns and cities with locations taken from the Geonames database. Typically I am looking at the 1000 largest cities in a country that also have a population of at least 1000. Routes are currently calculated in duplicate, as both fastest routes and shortest routes. Reasonable road speeds vary according to broad road categories. Estimated travel times are computed alongside road distances. These details are preferences for consistency; they are not set in stone.
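As an illustration of the city selection rule described above, a minimal sketch follows. It assumes the Geonames records for one country have already been parsed into dictionaries with a `population` key; the function name and record layout are hypothetical.

```python
import heapq


def select_cities(places, max_count=1000, min_population=1000):
    """Pick up to `max_count` of the most populous places, dropping any
    below `min_population`.

    `places` is an iterable of dicts with at least a "population" key
    (an assumed, already-parsed representation of Geonames rows).
    """
    eligible = (p for p in places if p["population"] >= min_population)
    return heapq.nlargest(max_count, eligible, key=lambda p: p["population"])
```

For a small country the population floor will usually be the binding limit, so the returned list may be much shorter than `max_count`.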

Best Answer

As far as I'm aware, there are no statistics available on routing quality in different countries. The best you can do is measure things indirectly.

I organised a project to do this in the USA on OpenStreetMap during 2009. For full details, see the 250 cities project on the OpenStreetMap wiki.

First, we measured how many of the 250 largest cities in the country were routable, using a large matrix of every possible start/finish pair. Second, we found places where the outward and return routes were significantly different lengths. Third, we looked at routes where the distance was more than 1.4 times the great-circle distance, as an indication of likely routing problems.
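The three checks above could be sketched roughly as follows. This is not the code used in the 250 cities project. It assumes route distances have already been computed by some routing engine and stored in a dictionary keyed by `(start, finish)`, with `None` or a missing key meaning no route was found. The 1.4 detour limit comes from the description above; the 1.2 asymmetry limit is an arbitrary placeholder for "significantly different".

```python
import math
from itertools import permutations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def assess(cities, route_km, detour_limit=1.4, asymmetry_limit=1.2):
    """Summarise routing quality over every ordered pair of cities.

    cities:   dict of name -> (lat, lon)
    route_km: dict of (start, finish) -> routed distance in km, or None
    """
    pairs = list(permutations(cities, 2))
    routed = {p: route_km.get(p) for p in pairs}
    routable = [p for p in pairs if routed[p] is not None]

    # Check 2: outward and return routes that differ a lot in length.
    asymmetric = []
    for a, b in routable:
        if a < b and routed[(b, a)] is not None:
            d1, d2 = routed[(a, b)], routed[(b, a)]
            if max(d1, d2) > asymmetry_limit * min(d1, d2):
                asymmetric.append((a, b))

    # Check 3: routes much longer than the great-circle distance.
    detours = [
        (a, b) for a, b in routable
        if routed[(a, b)] > detour_limit * great_circle_km(cities[a], cities[b])
    ]

    return {
        "routable_fraction": len(routable) / len(pairs) if pairs else 0.0,
        "asymmetric": asymmetric,
        "detours": detours,
    }
```

The flagged pairs are candidates for inspection rather than confirmed errors: mountainous terrain, coastlines and one-way systems can all produce long or asymmetric routes on a perfectly mapped network.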

You could use a similar approach to estimate routing quality in your countries of interest. Your assumption that major cities and major roads are mapped first generally holds true for OpenStreetMap, in my experience.
