PostGIS – How to Speed Up Intersection and Union Operations in PostGIS

intersectperformancepostgisunion

I have three tables: water, wetlands and parcels. I want to update parcel table with area that is covered by water or wetland (they can overlap).

So I want to

  1. Merge water with wetland
  2. Dissolve and multipart -> singlepart
  3. Intersect this with parcels to calculate area:

Number of parcels is ~400 000, water 200 and wetlands 10.

The Query runs for 35 min and update ~2200 rows with correct results. This seems very slow so im probably doing something inoptimal. What can I do to speed it up?

update parcels

set water_or_wetland_area = sub.area_m2

from parcels,
     (select parcels.parcel_id, sum(round(st_area(st_intersection(waterwetland.geom,
     parcel.wkb_geometry))::numeric,2)) as area_m2
     from parcel,
         (select (st_dump(st_union(st_union(water.wkb_geometry,
         wetland.wkb_geometry)))).geom
         from water, wetland
         ) as waterwetland
     where st_intersects(waterwetland.geom, parcels.wkb_geometry)
     group by parcel.parcel_id) as sub     
where parcels.parcel_id = sub.parcel_id

enter image description here

Best Answer

If you want to be efficient with geometries, you should always think about the indexes.

The first think is to be sure that you have an index on parcel:

CREATE INDEX parcels_geom_idx ON parcels USING GIST(geom);

Then create a table with the union of water and wetland:

CREATE TABLE waterwetland AS 
SELECT
    (
      st_dump(
       st_union(water.wkb_geometry, wetland.wkb_geometry)
      )
     ).geom as geom
 FROM water, wetland;

(I removed the second st_union, I don't understand why it is here?) That way you can create an index on it:

CREATE INDEX waterwetland_geom_idx ON waterwetland USING GIST(geom);

Don't forget to analyze to be sure the planner know about your table (especially if you do this inside a script):

ANALYZE waterwetland;
ANALYZE parcels;

Lastly, updates in postgresql are a little slow (because it does actually a lot of things) so if you really need speed and your table is not too big you can simply create another table:

CREATE TABLE parcels2 AS 
SELECT
    parcels.*,
    req.area_m2
FROM
    parcels
LEFT JOIN
    (
SELECT
    parcels.parcel_id as parcel_id, 
    sum(round(st_area(st_intersection(waterwetland.geom,parcels.wkb_geometry))::numeric,2)) as area_m2
FROM 
  parcels, waterwetland
 WHERE st_intersects(waterwetland.geom, parcels.wkb_geometry)
 GROUP BY parcel.parcel_id
) as req
ON parcels.parcel_id=req.parcel_id;

(untested code, but I think this should be the idea)

Related Question