PostGIS – How to Perform Nearest Neighbor Calculation in PostGIS?

nearest neighborpostgisproximity

I'm learning how to use PostGIS and spatial databases for analysis. What I am trying to do is perform a calculation to get the distance for the nearest polygon in a file, using the edge calculation, rather than vertices.

Using this answer from Paul Ramsey to Finding minimum edge to edge distance of polygons using ArcGIS Desktop? which is a similar question:

CREATE TABLE mytable_distances AS
SELECT a.id, b.id,
ST_Distance(a.geom::geography,
b.geom::geography) as distance FROM
mytable a, mytable b;

I am attempting to apply it to my spatial database. I don't understand the structure of this query though. I think CREATE TABLE mytable_distances AS creates a table to store the result but after this part I'm lost. Are a and b column names? If so, why would I specify two columns to calculate this?

My table is called TestArea and I have experimented with some basic queries successfully:

SELECT 
  "TestArea".hgt
FROM 
  public."TestArea"
WHERE
  "TestArea".area > 100

The structure of the database in PGAdmin III is as follows, with my table called TestArea. I'm not sure what the nearest neighbor calculation should look like using my column headers (all of these objects are polygons).

enter image description here

Best Answer

a and b are alias table names to the same table. This is effectively a T1 CROSS JOIN T2 in DB-speak. This allows a self-join to say "how close one part is to another" in a single table.

SELECT 
  a.hgt AS a_hgt,
  b.hgt AS b_hgt,
  ST_Distance(a.the_geom, b.the_geom) AS distance_between_a_and_b
FROM 
  public."TestArea" AS a, public."TestArea" AS b
WHERE
  a.gid < b.gid AND a.area > 100 AND b.area > 100

You might want to add another WHERE clause to limit the number of rows, e.g., add AND ST_Distance(a.the_geom, b.the_geom) < 1000.0 so that all distances are less than a kilometer (if you have projected UTM).