[GIS] What’s an efficient way of detecting road junctions on a map

arcgis-desktop, autocad, fme, gml, network

I may receive maps from any source for my project, and the C++ algorithm I'm writing needs to know where the road junctions (nodes where more than two road segments meet) are. In addition, many of the maps I get contain roads that overshoot or undershoot (i.e., roads that aren't properly connected). One way to correct the problem is to 'clean' the road network using AutoCAD.
The questions I have are:

  1. Is there an efficient way (any software which can do it?) to get my map in a GML format, which contains information about road junctions in the map? (maybe have a GML tag which says that a particular node is a junction)
  2. Is there any other way to 'clean' the road network?

Colleagues have suggested FME, but that involves writing scripts, and we're not sure whether a script would be flexible enough to handle every map. The only other way I know to detect junctions is brute force: find which road segments have common nodes. Would ArcGIS help? (I haven't used it, but I've heard of it.) I'm sure there must be a better way…

Best Answer

You can analyze polylines in amazing ways by using buffers. This is usually inefficient--buffers create many additional vertices--but (a) it is a technique available in many GISes (vector or raster based) and (b) it sometimes can produce information that is otherwise hard to get.

In this case, buffering the road by a small amount and then buffering by the negative of the same amount leaves little "islands" around all bends and around all intersections. This is easy to prove geometrically.

Here is an example of a 10 m polyline buffer (gray) and its -10 m buffer (light red) in a map that is 650 m wide:

Figure 1

Now intersect the original polyline layer with these island polygons, merge the segments by island identifier, and count the pieces:

Figure 2

The light yellow segments designate the high-count pieces and the dark cyan segments designate the low-count ones. In this fashion we have (a) found all bends, intersections (including self-intersections), and near-junctions (see the extreme left, where the two segments do not quite meet) and (b) differentiated the bends from the intersections. We can find the near-junctions by selecting the islands whose segments form two or more separate connected groups: within a bend, all the segments are connected to one another.
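The counting step can be sketched outside the GIS as well. The following is only an illustration in plain Python, not the procedure used to make the figures: it assumes the GIS has already clipped the roads to the islands and exported each piece as an island identifier plus a coordinate list, and that the pieces are noded (split wherever they cross or touch). The function name `classify_islands`, the input layout, and the rounding tolerance are all hypothetical.

```python
from collections import defaultdict


def find(parent, p):
    """Union-find lookup with path halving; registers p on first sight."""
    parent.setdefault(p, p)
    while parent[p] != p:
        parent[p] = parent[parent[p]]
        p = parent[p]
    return p


def classify_islands(pieces, ndigits=6):
    """Label each island as 'bend', 'intersection' or 'near-junction'.

    pieces: iterable of (island_id, [(x, y), ...]) clipped polylines.
    Assumption: pieces are noded, i.e. they only touch at shared vertices.
    """
    by_island = defaultdict(list)
    for island_id, coords in pieces:
        # Round coordinates so vertices that should coincide compare equal.
        snapped = [(round(x, ndigits), round(y, ndigits)) for x, y in coords]
        by_island[island_id].append(snapped)

    labels = {}
    for island_id, lines in by_island.items():
        parent = {}                # union-find forest over vertices
        degree = defaultdict(int)  # number of segment ends at each vertex
        for line in lines:
            for a, b in zip(line, line[1:]):
                if a == b:
                    continue       # skip zero-length segments
                degree[a] += 1
                degree[b] += 1
                parent[find(parent, a)] = find(parent, b)

        components = len({find(parent, p) for p in list(parent)})
        if components >= 2:
            labels[island_id] = "near-junction"   # pieces that never touch
        elif max(degree.values(), default=0) >= 3:
            labels[island_id] = "intersection"    # three or more ends meet
        else:
            labels[island_id] = "bend"            # one connected, unbranched run
    return labels


# Made-up sample data: a bend, a four-way crossing, and an undershoot.
pieces = [
    (1, [(0, 0), (10, 0), (10, 10)]),
    (2, [(40, 50), (50, 50)]),
    (2, [(50, 50), (60, 50)]),
    (2, [(50, 40), (50, 50)]),
    (2, [(50, 50), (50, 60)]),
    (3, [(100, 0), (109, 0)]),
    (3, [(110, -5), (110, 5)]),
]
print(classify_islands(pieces))
# {1: 'bend', 2: 'intersection', 3: 'near-junction'}
```

Because the classification looks only at shared vertices, it gives the same answer whether a bend arrives as one polyline or as one feature per segment. If the pieces are not noded, a crossing without a shared vertex would be reported as a near-junction, so that assumption matters.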

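Due to the symmetry of buffering, the centroids of the intersection islands mark the points of intersection (exactly where the roads cross symmetrically, and to within a fraction of the buffer distance otherwise).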
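If the islands are exported as plain vertex rings, their centroids can be computed with the standard area-weighted (shoelace) formula. A minimal sketch in plain Python follows; the function name `polygon_centroid` and the sample ring are made up for illustration, and the ring is assumed to be a simple polygon without holes.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...].

    The ring may be open or explicitly closed (first vertex repeated last).
    """
    if ring[0] == ring[-1]:
        ring = ring[:-1]
    twice_area = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0      # twice the signed area of one triangle
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon (zero area)")
    return cx / (3 * twice_area), cy / (3 * twice_area)


# Hypothetical four-pointed island, symmetric about a crossing at (5, 7).
island = [(15, 7), (6, 8), (5, 17), (4, 8), (-5, 7), (4, 6), (5, -3), (6, 6)]
print(polygon_centroid(island))
```

For this symmetric ring the result is the crossing point (5, 7), up to floating-point rounding. Any GIS that reports polygon centroids directly makes this step unnecessary.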

One beautiful aspect of this style of analysis is that it does not care how the underlying polyline is represented: it could be a single feature, it could be one feature for each line segment, or anything in between.