PostGIS Nearest Neighbors – How to Find Nearest Neighbors Faster Using PostGIS and OpenStreetMap

nearest neighboropenstreetmapoptimizationpostgispostgresql

I currently have a query that calculates the number of useful amenities within a certain radius of each town, using OSM data:

create table places as
select town.name, place, town.way, (
  select count(*)
  from planet_osm_point pub
  where (pub.amenity in ('pub','cafe','restaurant','supermarket','fuel') or pub.shop in ('bakery','convenience')) and pub.way <-> town.way < 5000
) + (
  select count(*)
  from planet_osm_polygon pub
  where (pub.amenity in('pub','cafe','restaurant','supermarket','fuel') or pub.shop in ('bakery','convenience')) and pub.way <-> town.way < 5000
) pubs
from planet_osm_point town where place in ('hamlet','village','town','city');

It works fine, but is very slow, probably because of the high level algorithm: compare every town to every amenity.

Are there any easy ways to optimise this? (Both tables have indexes already.)

"Explain" returns this:

 Seq Scan on planet_osm_point town  (cost=0.00..260497454.31 rows=6490 width=125)
   Filter: (place = ANY ('{hamlet,village,town,city}'::text[]))
   SubPlan 1
     ->  Aggregate  (cost=17614.09..17614.10 rows=1 width=0)
           ->  Seq Scan on planet_osm_point pub  (cost=0.00..17612.32 rows=710 width=0)
                 Filter: (((amenity = ANY ('{pub,cafe,restaurant,supermarket,fuel}'::text[])) OR (shop = ANY ('{bakery,convenience}'::text[]))) AND (((way)::box <-> (town.way)::box) < 5000::double precision))
   SubPlan 2
     ->  Aggregate  (cost=22522.32..22522.33 rows=1 width=0)
           ->  Seq Scan on planet_osm_polygon pub  (cost=0.00..22521.83 rows=197 width=0)
                 Filter: (((amenity = ANY ('{pub,cafe,restaurant,supermarket,fuel}'::text[])) OR (shop = ANY ('{bakery,convenience}'::text[]))) AND (((way)::box <-> (town.way)::box) < 5000::double precision))

Answer

My new query looks like this:

create table places as
select town.name, place, town.way, 
  (select count(*) 
  from planet_osm_point p 
  where p.amenity = 'pub' and st_dwithin(p.way,town.way,5000) as p1) + 
  (select count(*) 
  from planet_osm_polygon p 
  where p.amenity = 'pub' and st_dwithin(p.way,town.way,5000) as p2) pubs
...

Best Answer

Don't use the distance operation unless you actually need the distance.

You can use the ST_DWithin to get geometries within a certain distance.

Right now I don't have a PostGres database to test and give you a SQL query for your data, but have a look at the sample query given on the documentation page

Related Question