[GIS] PostGIS union multiple tables, big dataset, faster approach

bufferpostgisunion

My goal is simple. I want to union a given set of shapefiles. In the shapefile are stored polygons. The data bounding box is Germany, so the data is relatively big. I tried to use all functions implemented in QGIS. Either they failed or the precessing time is just too long to get any results. I read a lot of question throughout this forum and I set up a PostGIS Database.

My query is:

 CREATE TABLE dlm_union_2 AS
  SELECT
  a.gid AS ID,
  ST_UNION(a.geom,b.geom) AS geom 
  FROM sie01_f a, sie02_f b

I added the Selection of the a.gid since I have no objects returned when I export the table via pgSHAPELOADER. I want to use more than two tables in one expression and is there a way to get this faster? With only two tables my processing time is now about 800,000 ms and the progress isn't finished yet.
After that I want to buffer the merged regions and would like to union it again afterwards. I have a lot of shapes that I have to union at different states of my project.

I am working on WIN 7, 4GB ram, Intel Core i5 M520. PostGIS and PostgreSQL should be up to date, because I installed the components today. Every suggestion are highly appreciated.

UPDATE: I have created a buffer on a multilinestring table. Processing time was about 300,000 ms what is actually great compared to functions implemented in QGIS. My problem is that I need to union the result for further analysis. I'am using ST_UNION with a single table with polygon geometry input. Objectsize is around 300,000 polygon features. At this point my processing time is 2,800,000 ms and I don't know when it will finish. Is there any way to get such a calculation faster?

After processing time 5,319,543 ms the query aborted with following exception. This was the try to use ST_UNION with one table including 300,000 polygons.

ERROR:  GEOSUnaryUnion: std::bad_alloc

********** Fehler **********

ERROR: GEOSUnaryUnion: std::bad_alloc
SQL Status:XX000

UPDATE:

According to Mattmakesmaps I tried the way of ST_Buffer(ST_Collect(geom),0) instead of using the UNION command. With the errors I am posted above I was not able the handle the whole extend of the data. So I split the problem into smaller ones. In my case I used the state-borders in Germany to divide my problem into 16 smaller ones. Mattmakesmaps solution works fine here. Recently I am try to use the ST_Simplify command to make the dimension of the polygons smaller giving me a chance to process all of the data at ones. I will keep you up to date.

Best Answer

For what it's worth, the ST_Collect() function is much faster then a call to ST_Union().

The speed increase is mainly attributed to the fact that ST_Collect() returns a Geometry Collection object, and doesn't try to dissolve overlapping geometries.

You can create a single, OGC-compliant geometry (e.g. a POLYGON or MULTIPOLYGON) by wrapping ST_Collect() in a 0-width buffer, e.g. ST_Buffer(ST_Collect(geometry), 0).

I've used it to generate global-scale label points for OpenStreetMap water features (a rather large dataset). With proper GIST indexes as noted by previous respondents, it's a pretty fast operation.

Good Luck!

Related Question