[GIS] PostGIS doesn’t use spatial index with ST_Intersects

openstreetmappostgispostgresqlqueryspatial-index

I imported the data from openstreetmap using osm2pgsql and copied it another table structure. If I made a query for containing rows using st_intersects, st_contain or someone else, postgis don't use the gist-INDEX. Always a seq-scan will be made. If I filter the rows first using the &&-operator (bbox of geometries), postgis uses the gist-index for the postgis functions. I don't understand why, where is the problem and how can I change this.

Here I've created a simplified table of the original (you can also use a planet_osm_polygon for queries):

CREATE TABLE osm_addr2 (osm_id bigint, geometry geometry);
CREATE INDEX osm_addr2_geom ON osm_addr2 USING gist (geometry);

Here I created a very simple example query:

EXPLAIN ANALYSE SELECT * FROM osm_addr2 AS addr 
WHERE -- addr.geometry && (SELECT geometry FROM osm_addr2 WHERE osm_id=-332537) AND 
  st_intersects(addr.geometry, (SELECT geometry FROM osm_addr2 WHERE osm_id=-332537));

Using st_intersects only, I have the following result (first line only):

Seq Scan on osm_addr2 addr (cost=24.50..336.10 rows=387 width=40) (actual time=0.001..0.001 rows=0 loops=1)

If I uncomment the &&-operator the index will be used (first line only):

Index Scan using osm_addr2_geom on osm_addr2 addr (cost=49.15..57.41 rows=1 width=40) (actual time=0.031..0.031 rows=0 loops=1)

I'm using postgresql 9.3 with postgis 2.1, see postgis_full_version():

POSTGIS="2.1.4 r12966" GEOS="3.3.3-CAPI-1.7.4" PROJ="Rel. 4.7.1, 23 September 2009" GDAL="GDAL 1.9.0, released 2011/12/29" LIBXML="2.8.0" LIBJSON="UNKNOWN" RASTER

I can reproduce this on Ubuntu 14.04 and Debian 7. I hope anyone can help because using the intersection operator will inflate my queries.

Best Answer

I have found that rearranging the query so that the sub-query is at the same level as the initial select, essentially a Cartesian product, but then using the where clause to restrict the records read, will cause the indexes to be used and avoid a full table scan.

SELECT * 
FROM 
   osm_addr2 AS addr, 
   (SELECT geometry FROM osm_addr2 WHERE osm_id=-332537) as addr2
WHERE st_intersects(addr.geometry, addr2.geometry);

EDIT: thanks to MikeT for the link to the relevant docs and to Jakub for the term function inlining.

EDIT 2: I now find it more elegant to use CTE queries for this kind of problem, as they are easier to read than subqueries, and have the same effect as far as making the spatial index get utilized for the spatial intersection.

WITH addr2(geometry) AS 
     (SELECT geometry FROM osm_addr2 WHERE osm_id=-332537) 
   SELECT addr.* 
     FROM 
        osm_addr2 addr, addr2
    WHERE ST_Intersects(addr.geometry, addr2.geometry);