PostGIS ST_Union – How to Efficiently ST_Union All Overlapping Polygons in a Single Table?

postgis

My goal is take a single table and st_union all polygons that are either touching or close to each other into single polygons

I'm a C# developer who is starting to learn about PostGIS. Using the code below, I was able to accomplish this, but it seems inefficient, and there's much to PostGIS that is new to me.

From my initial attempt (still in comments), I was able to reduce the iterations by using array_agg with ST_UNION instead of unioning just polys at a time.

I wind up with 133 polys from my orig 173.

sql = "DROP TABLE IF Exists tmpTable; create table tmpTable ( ID varchar(50), Geom  geometry(Geometry,4326), Touchin varchar(50) ); create index idx_tmp on tmpTable using GIST(Geom); ";
                CommandText = sql;
                ExecuteNonQuery();

                sql = "";
                for (int i = 0; i < infos.Count(); i++)
                {
                    sql += "INSERT INTO tmpTable SELECT '" + infos[i].ID + "', ST_GeomFromText('" + infos[i].wkt + "', 4326), '0';";
                }
                CommandText = sql;
                ExecuteNonQuery();

                CommandText = "update tmpTable set touchin = (select id from tmpTable as t where st_intersects(st_buffer(geom, 0.0001), (select geom from tmpTable as t2 where t2.ID = tmpTable.ID ) ) and t.ID <> tmpTable.ID limit 1)";
                ExecuteNonQuery();

                CommandText = "select count(*) from tmpTable where touchin is not null";
                long touching = (long)ExecuteScalar();
                string thisId = "";
                // string otherId = "";
                while (touching > 0)
                {
                    CommandText = "select touchin, count(*)  from tmpTable where touchin is not null group by touchin order by 2 desc limit 1";
                    //CommandText = "select id, touchin from tmpTable where touchin is not null";
                    using (var prdr = ExecuteReader())
                    {
                        CommandText = "";
                        if (prdr.Read())
                        {
                            thisId = prdr.GetString(0);
                             // otherID = prdr.GetString(1);
                            CommandText = @"update tmpTable set geom = st_union(unioned) 
                                from (select array_agg(geom) as unioned from tmpTable where touchin = '" + thisId + "' or id = '" + thisId + @"') as data
                                where id = '" + thisId + "'";
                             // CommandText = "update tmpTable set geom = st_union(geom, (select geom from tmpTable where ID = '" + otherId + "')) where id = '" + thisId + "'";
                        }
                    }

                    if (!string.IsNullOrEmpty(CommandText))
                    {
                        ExecuteNonQuery();
                        //CommandText = "update tmpTable set geom = null, touchin = null where ID = '" + otherId + "'";
                        CommandText = "update tmpTable set geom = null, touchin = null where touchin = '" + thisId + "'";
                        ExecuteNonQuery();                            
                    }

                    CommandText = "update tmpTable set touchin = (select id from tmpTable as t where st_intersects(st_buffer(geom, 0.0001), (select geom from tmpTable as t2 where t2.ID = tmpTable.ID ) ) and t.ID <> tmpTable.ID limit 1)";
                    ExecuteNonQuery();

                    CommandText = "select count(*) from tmpTable where touchin is not null";
                    touching = (long)ExecuteScalar();
                }

Here's sample of the dataset I am using:

INSERT INTO tmpTable SELECT '872538', ST_GeomFromText('POLYGON((-101.455035985 26.8835084441,-101.455035985 26.8924915559,-101.444964015 26.8924915559,-101.444964015 26.8835084441,-101.455035985 26.8835084441))', 4326), '0';
INSERT INTO tmpTable SELECT '872550', ST_GeomFromText('POLYGON((-93.9484752173 46.0755084441,-93.9484752173 46.0844915559,-93.9355247827 46.0844915559,-93.9355247827 46.0755084441,-93.9484752173 46.0755084441))', 4326), '0';
INSERT INTO tmpTable SELECT '872552', ST_GeomFromText('POLYGON((-116.060688575 47.8105084441,-116.060688575 47.8194915559,-116.047311425 47.8194915559,-116.047311425 47.8105084441,-116.060688575 47.8105084441))', 4326), '0';
INSERT INTO tmpTable SELECT '872553', ST_GeomFromText('POLYGON((-116.043688832 47.8125084441,-116.043688832 47.8214915559,-116.030311168 47.8214915559,-116.030311168 47.8125084441,-116.043688832 47.8125084441))', 4326), '0';
INSERT INTO tmpTable SELECT '872557', ST_GeomFromText('POLYGON((-80.6380222359 26.5725084441,-80.6380222359 26.5814915559,-80.6279777641 26.5814915559,-80.6279777641 26.5725084441,-80.6380222359 26.5725084441))', 4326), '0';
INSERT INTO tmpTable SELECT '872558', ST_GeomFromText('POLYGON((-80.6520223675 26.5755084441,-80.6520223675 26.5844915559,-80.6419776325 26.5844915559,-80.6419776325 26.5755084441,-80.6520223675 26.5755084441))', 4326), '0';
INSERT INTO tmpTable SELECT '872559', ST_GeomFromText('POLYGON((-80.6400224991 26.5785084441,-80.6400224991 26.5874915559,-80.6299775009 26.5874915559,-80.6299775009 26.5785084441,-80.6400224991 26.5785084441))', 4326), '0';
INSERT INTO tmpTable SELECT '872560', ST_GeomFromText('POLYGON((-80.6530226307 26.5815084441,-80.6530226307 26.5904915559,-80.6429773693 26.5904915559,-80.6429773693 26.5815084441,-80.6530226307 26.5815084441))', 4326), '0';
INSERT INTO tmpTable SELECT '872568', ST_GeomFromText('POLYGON((-90.7892258584 30.7365084441,-90.7892258584 30.7454915559,-90.7787741416 30.7454915559,-90.7787741416 30.7365084441,-90.7892258584 30.7365084441))', 4326), '0';
INSERT INTO tmpTable SELECT '872569', ST_GeomFromText('POLYGON((-90.7832259127 30.7375084441,-90.7832259127 30.7464915559,-90.7727740873 30.7464915559,-90.7727740873 30.7375084441,-90.7832259127 30.7375084441))', 4326), '0';

Best Answer

The simplest way would be to ST_Union the entire table:

SELECT ST_Union(geom) FROM tmpTable;

This will give you one huge MultiPolygon, which probably isn't what you want. You can get the individual dissolved components with ST_Dump. So:

SELECT (ST_Dump(geom)).geom FROM (SELECT ST_Union(geom) AS geom FROM tmpTable) sq;

This gives you a separate polygon for each set of touching inputs, but groups of inputs that were separated by a short distance will remain as separate geometries. If you have access to PostGIS 2.2.0rc1, you can merge geometries that are close together into a single GeometryCollection using ST_ClusterWithin:

SELECT unnest(ST_ClusterWithin(geom, 0.0001)) AS grp FROM tmpTable;

If you want Polygons within the GeometryCollection to be dissolved, you can run ST_UnaryUnion on each GeometryCollection in the result, like:

SELECT ST_UnaryUnion(grp) FROM
(SELECT unnest(ST_ClusterWithin(geom, 0.0001)) AS grp FROM tmpTable) sq;
Related Question