[GIS] Removing unwanted Linestrings from Multilinestring in Postgis

postgispostgresql

I have a Multilinestring layer and I have noticed that in some of my geometries there are unwanted Linestrings. They are composed of two points (start and end) but both are the exact same points. That is wreaking havoc when i try to read this layer from SQLServer.

Example of one geometry:

   "MULTILINESTRING((-3.00712325415935 43.3437538570123,-3.00711888681593 43.3437506773148),
   (-3.00693096810649 43.34361446115,-3.00692660078307 43.3436112814451),
   (-3.00692660078307 43.3436112814451,-3.00692660078307 43.3436112814451),
   (-3.00692660078307 43.3436112814451,-3.0068846515836 43.3436419720614),
   (-3.00692660078307 43.3436112814451,-3.0069222334601 43.34360810174),
   (-3.00691127288165 43.3436001834041,-3.00690690556026 43.3435970036984))"

Notice third Linestring that has the same start and end points. How could I remove these unwanted Linestrings from my geometries?

One approach could be to remove the linestrings which st_length=0 but I don't know how to build the query. I suppose I need to use st_linemerge first to split the multi geometry into single parts?

I need some expert help on this…

Best Answer

You can use ST_Dump to pull out the individual Linestrings, test those for length > 0, and then recreate the Multilinestring using ST_Collect (rather than ST_Union or ST_LineMerge, which involve actual spatial processing calculations, rather than simply combining the underlying geometries into their multi version, as pointed out by @dbaston).

SELECT 
     ST_AsText(ST_Collect(geom)) 
  FROM 
    (SELECT    
        (ST_Dump(
           ST_GeomFromText('MULTILINESTRING(
             (-3.00712325415935 43.3437538570123,-3.00711888681593 43.3437506773148),
             (-3.00693096810649 43.34361446115,-3.00692660078307 43.3436112814451),
             (-3.00692660078307 43.3436112814451,-3.00692660078307 43.3436112814451),
             (-3.00692660078307 43.3436112814451,-3.0068846515836 43.3436419720614),
             (-3.00692660078307 43.3436112814451,-3.0069222334601 43.34360810174),
             (-3.00691127288165 43.3436001834041,-3.00690690556026 43.3435970036984))'))
       ).geom) g
WHERE ST_Length(g.geom) > 0;

which produces:

MULTILINESTRING(
    (-3.00691127288165 43.3436001834041,-3.00690690556026 43.3435970036984),
    (-3.00712325415935 43.3437538570123,-3.00711888681593 43.3437506773148),
    (-3.00692660078307 43.3436112814451,-3.0069222334601 43.34360810174),
    (-3.00693096810649 43.34361446115,-3.00692660078307 43.3436112814451),
    (-3.00692660078307 43.3436112814451,-3.0068846515836 43.3436419720614))

ie, with the line with the repeated coordinates gone. Obviously, you will want to remove the ST_AsText which is just for illustration. Note, ST_Dump is a set returning function, so you have to wrap it in parentheses followed by .geom, to get the geometry back, ie,

(ST_Dump(geom)).geom

which feels a bit strange at first.

You could do something similar to the above using a count of distinct points, but I think, testing on the length as you suggested, is the cleanest.

EDIT. It actually turns out that in this case, you don't even need to test for the length of the linestring, as ST_Union ignores the point when building the MultiLinestring back up. There are other ways to ignore certain geometry types when doing a union, such as testing with ST_GeometryType in the where clause.