[GIS] How to get the two nearest points between two polygons

arcpyogrshapely

I'm trying to draw the nearest line between the nearest polygons in a Polygon dataset (ESRI Shapefile). I know that for this purpose there is the Near Tool (ArcGIS), but it just return the point TO which to connect but not the point FROM the first polygon and I don't want the centroid.

So I tried to do my own implementation and surprisingly it is running faster than Near Tool (at least for my huge data), and it is performing this way:

  1. Get a cursor and loop through all polygons
  2. For each polygon in cursor get another cursor2 from polygons which are in certain range from the extents
  3. Then it just calculate distances using built-in functions and get the closest one
  4. I can get the two closest points looping through all point in the polygons and use distance functions

Am I doing it the best way?

I tried using QgsSpatialIndex but it is returning the wrong pairs for some reason. Is there any built-in function to return the two closest points directly?

BTW I'm using arcpy, but previously I was using pyqgis.

Best Answer

Ok I could get it done with numpy instead of looping. I've done this way:

def useNumpy(pt1, pt2):
    #The input are two lists of points defining the two polygons of interest
    #Shorten some function names that will be repeated
    nr = numpy.roll
    nt = numpy.tile

    #Give lists as array
    arr1 = numpy.array(pt1)
    arr2 = numpy.array(pt2)
    A0 = arr1[:-1,0]    #array of X for pt1, last point is the same of the first
    A1 = arr1[:-1,1]    #array of Y for pt2

    #Roll arrays to serve as the second point to form lines
    #each [[A0,A1],[B0,B1]] form a line
    B0 = nr(A0,1)
    B1 = nr(A1,1)

    #Do the same for points of the second polygon
    C0 = arr2[:-1,0]
    C1 = arr2[:-1,1]
    D0 = nr(C0,1)
    D1 = nr(C1,1)
    size1 = len(A0)
    size2 = len(C0)

    #Get tiled and repeated arrays to get all possible pairs of points
    A0 = nt(A0, size2)
    A1 = nt(A1, size2)
    B0 = nt(B0, size2)
    B1 = nt(B1, size2)
    C0 = C0.repeat(size1)
    C1 = C1.repeat(size1)
    D0 = D0.repeat(size1)
    D1 = D1.repeat(size1)

    '''
    Throw the arrays into nearestPoint function
    Every pair of nearest points will be either:
     - Points that define the polygons.
     - A point that define a polygon and a projected
     point into a line that define the other polygon.
    '''
    result1 = nearestPoint(A0,A1,B0,B1,C0,C1)
    result2 = nearestPoint(C0,C1,D0,D1,A0,A1)

    #Return the result that is nearest
    if result1[2] < result2[2]:
        return result1
    else:
        return result2

And the nearestPoint function:

def nearestPoint(A0, A1, B0, B1, C0, C1):
    '''
    This function is adapted from another non-vectorized implementation:
    <http://stackoverflow.com/questions/849211/shortest-distance-between-a-point-and-a-line-segment>

    The inputs are numpy arrays:
     - A0: array of X values for point A
     - A1: array of Y values for point A
     - B0: array of X values for point B
     - B1: array of Y values for point B
     - C0: array of X values for point C
     - C1: array of Y values for point C
    '''
    size = A0.size
    AB0 = B0-A0
    AB1 = B1-A1
    AB_squared = (AB0*AB0 + AB1*AB1)
    CA0 = C0-A0
    CA1 = C1-A1
    t = (CA0*AB0+CA1*AB1)/AB_squared
    result = numpy.array([[numpy.nan, numpy.nan]]).repeat(size,0)
    test = t<0
    result[test] = zip(A0[test],A1[test])
    test2 = t>1
    result[test2] = zip(B0[test2],B1[test2])
    test3 = numpy.logical_not(test | test2)
    if sum(test3) != 0:
        result[test3] = zip((A0[test3]+t[test3]*AB0[test3]),(A1[test3]+t[test3]*AB1[test3]))
    dists = ((result[:,0]-C0)**2+(result[:,1]-C1)**2)
    minDist = min(dists)**0.5
    minIndex = dists.argmin()
    minPt1 = (C0[minIndex],C1[minIndex])
    minPt2 = result[minIndex]
    return (minPt1, (minPt2[0],minPt2[1]), minDist)