PostGIS – Optimizing Point in Polygon Query for Large Datasets

postgispostgresql

I have 150 million points in a point table and would like to find the few points lying outside a given polygon geometry. I know that 99.9% of the points are within the polygon geometry. I am interested in finding the few points which lie outside the polygon.

My present best query using indexed PostGIS tables takes about 30 minutes to complete. Is there a way to optimize the following query knowing that most of the points are within the polygon (border)?

SELECT COUNT(*) 
FROM italy_points pt
JOIN borders poly
ON ST_WITHIN (pt.the_geom, poly.geom)
WHERE poly.iso3 = 'ITA'; 

The polygon is basically the admin 0 border of Italy.
Vertices – 405,000. Parts – 510. The envelope is much larger than the polygon (The polygon covers 24% of the envelope)

Best Answer

Use ST_Subdivide to cut your polygon into smaller polygons, save them to a table and create a spatial index. Then do your query on the gridded polygons.

Without that, spatial indexing does not provide any advantages in your case (only 1 polygon of interest).

Related Question