[GIS] How to find the two furthest points between a group of points

distancepostgispostgresql

I have a table with 20000 records and with those columns Shop, longitude, latitude, geom

I would like to find a way to find the pair of ths farthest shops between a selection of shops

For example we have outletnames : A,B,C,D,E,F,G,H,I,J,K,L,M,N,O…

I would like to know what is the bigest distance between all those shops A B C D and E
A-B = 4 km
A-C- = 2 km
A-D = 1 km
A-E = 1,3 km
B-C = 20 km
C-D = 50 km ….

I would like to have an optimised query that would give me as a result 50 km

Thank you

Best Answer

To measure distances in Km you need to transform your geometry's coordinates from long/lat (spherical) to x/y (planar). In my example I am using UTM zone 35N, EPSG 32635 - you will have to pick something appropriate based on your data's location.

The query below can be optimized by avoiding the st_transform call (e.g. by creating an intermediate table with geometry in EPSG 32635).

The other thing that can be improved is that distances are computed as a query over a cartesian product (the table is joined with itself) so you unnecessarily compute both the distance from A to B and the distance from B to A.

select a.outletname, b.outletname, 
st_distance(st_transform(a.geom, 32635), st_transform(b.geom, 32635)) 
from 
table_name a,
table_name b,
where a.outletname != b.outletname
and a.outletname in ('A', 'B', 'C', 'D')
and b.outletname in ('A', 'B', 'C', 'D')
and st_distance(st_transform(a.geom, 32635), st_transform(b.geom, 32635)) > 50000 
order by st_distance(st_transform(a.geom, 32635), st_transform(b.geom, 32635)) desc;

whuber's idea with the convex hull makes sense when the data set is large enough to prevent computing distances on the cartesian product.

I've checked on my dataset and, as expected, the greatest distance is equal when computed by the two different methods. For reference, the distances between the convex hull points query:

select st_distance(a.geom, b.geom) from
(select (st_dumppoints(st_convexhull(st_collect(st_transform(way, 32635))))).geom from planet_osm_point a where a.amenity='pharmacy' and st_contains((select way from planet_osm_polygon where name='Sector 1'), a.way)) a,
(select (st_dumppoints(st_convexhull(st_collect(st_transform(way, 32635))))).geom from planet_osm_point a where a.amenity='pharmacy' and st_contains((select way from planet_osm_polygon where name='Sector 1'), a.way)) b
order by st_distance(a.geom, b.geom) desc;

There's an additional problem with the convex hull approach - it works on aggregated geometry hence it anonymizes which point belongs to which feature, making it more difficult to compute outlet names.

The solution would probably be to compute the convex hull and then identify (by spatial queries), which feature geometries belong to the convex hull and run the distance query on those features only. E.g.:

The geometry of the first point of the convex hull:

osm=# select (st_dumppoints(st_convexhull(st_collect(way)))).geom from planet_osm_point a where a.amenity='pharmacy' limit 1;
                        geom                        
----------------------------------------------------
 010100002031BF0D00D120685EC2064241D3F3D3FDA9AE5541
(1 row)

The corresponding feature from the table:

osm=# select name, osm_id, way from planet_osm_point a  where a.way = '010100002031BF0D00D120685EC2064241D3F3D3FDA9AE5541';
   name    |   osm_id   |                        way                         
-----------+------------+----------------------------------------------------
 Dentafarm | 1395848741 | 010100002031BF0D00D120685EC2064241D3F3D3FDA9AE5541
(1 row)