[GIS] Spatial index on PostGIS with ST_MakeEnvelope testing against point geometries (before vacuum full analyse)

extentspostgisspatial-index

I just updated this question as I was able to fix the problem by running a vacuum full analyse on the tables. I tested vacuum alone and vacuum analyse before which did not solve the problem. Not sure what vacuum full analyse does but it fixed this kind of problem.

OP>>>

I have several large PostGIS tables with millions of records. Each record has a two geometry fields. A polygon geometry (Polygon, 3857) and a centroid of the polygon (Point, 4326). I also created a (Point, 3857) geometry for testing.

All geometry fields have a spatial index (GIST) and all tables are vacuumed.

I am trying to select records that are within a lat, lon bounding box which I create with ST_MakeEnvelop(). I test ST_Intersect as well as ST_DWithin with a buffer radius of 0. Testing a bounding box against point geometries does never use the spatial index. While testing bounding box against the polygon geometry does use a spatial index. This seems counter intuitive. As I am not even interested in displaying the geometries but only their centroids.

Is there any way to tweak my index or query to get points that are within / intersect a bounding box?

This query uses a spatial index:

SELECT lon, lat, pop
FROM cj_hx_16k
WHERE
  ST_DWithin(
    ST_MakeEnvelope(106.3, 26.4, 130.8, 35.7, 4326),
    geomcntr, 0);

This is the corresponding description for the explain plan.

Node Type = Seq Scan; Relation Name = cj_hx_16k; Alias = cj_hx_16k; Startup Cost = 0.0; Total Cost = 28343.77; Plan Rows = 1127; Plan Width = 23; Filter = ((geomcntr && '0103000020E610000001000000050000003333333333935A406666666666663A403333333333935A409A99999999D941409A999999995960409A99999999D941409A999999995960406666666666663A403333333333935A406666666666663A40'::geometry) AND ('0103000020E610000001000000050000003333333333935A406666666666663A403333333333935A409A99999999D941409A999999995960409A99999999D941409A999999995960406666666666663A403333333333935A406666666666663A40'::geometry && st_expand(geomcntr, '0'::double precision)) AND _st_dwithin('0103000020E610000001000000050000003333333333935A406666666666663A403333333333935A409A99999999D941409A999999995960409A99999999D941409A999999995960406666666666663A403333333333935A406666666666663A40'::geometry, geomcntr, '0'::double precision));

This query does use the spatial index on the geom field.

SELECT lon, lat, pop
FROM cj_hx_32k
WHERE
  ST_DWithin(
    ST_Transform(
        ST_MakeEnvelope(106.3, 26.4, 130.8, 35.7, 4326),
                 3857), geom, 0);

This is the corresponding explain plan:

Node Type = Bitmap Heap Scan; Relation Name = cj_hx_32k; Alias = cj_hx_32k; Startup Cost = 147.26; Total Cost = 2861.79; Plan Rows = 204; Plan Width = 24; Recheck Cond = (geom && '0103000020110F00000100000005000000EDE4E1BBF5916641444475CC81424741EDE4E1BBF59166418189CBFD963F50414311AAACA9C56B418189CBFD963F50414311AAACA9C56B41444475CC81424741EDE4E1BBF5916641444475CC81424741'::geometry); Filter = (('0103000020110F00000100000005000000EDE4E1BBF5916641444475CC81424741EDE4E1BBF59166418189CBFD963F50414311AAACA9C56B418189CBFD963F50414311AAACA9C56B41444475CC81424741EDE4E1BBF5916641444475CC81424741'::geometry && st_expand(geom, '0'::double precision)) AND _st_dwithin('0103000020110F00000100000005000000EDE4E1BBF5916641444475CC81424741EDE4E1BBF59166418189CBFD963F50414311AAACA9C56B418189CBFD963F50414311AAACA9C56B41444475CC81424741EDE4E1BBF5916641444475CC81424741'::geometry, geom, '0'::double precision));
Node Type = Bitmap Index Scan; Parent Relationship = Outer; Index Name = sidx_cj_hx_32k_geom; Startup Cost = 0.0; Total Cost = 147.21; Plan Rows = 3058; Plan Width = 0; Index Cond = (geom && '0103000020110F00000100000005000000EDE4E1BBF5916641444475CC81424741EDE4E1BBF59166418189CBFD963F50414311AAACA9C56B418189CBFD963F50414311AAACA9C56B41444475CC81424741EDE4E1BBF5916641444475CC81424741'::geometry);

Best Answer

what I meant in my comment above is apply a small buffer to the points, so when you go to intersect the points and the bounding box an index is used.

something like this:

with point_buff as(select st_buffer(geom,.001) geom,lon,lat,pop from cj_hx_32k)

SELECT lon, lat, pop
    FROM point_buff
    WHERE
      ST_intersects(
        ST_Transform(
            ST_MakeEnvelope(106.3, 26.4, 130.8, 35.7, 4326),
                     3857), geom);

ps* I dont know which SRID you are using for the points layer, but if its in feet then use a small measure like I did in my example. If you are using 3857 or 4326 maybe cast to geometry first, then buffer, then case back to whatever main SRID you are using