You can test for a self-intersecting linestring with ST_IsSimple(geom):
SELECT ST_IsSimple('LINESTRING (50 50, 150 150, 50 150, 150 50)');
st_issimple
-------------
f
(1 row)
Above image and below caption are from JTS TestBuilder (click "Simple?")
Self-intersection at POINT ( 100.0 100.0 )
This can be fixed with ST_UnaryUnion(geom) (since PostGIS 2.0), which returns a valid/simple three-piece multilinestring:
MULTILINESTRING((50 50, 100 100),
(100 100, 150 150, 50 150, 100 100),
(100 100, 150 50))
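To see where the (100 100) node comes from, here is a minimal pure-Python sketch of the underlying check: test every pair of non-adjacent segments for a crossing. The function names are my own, and this is not how PostGIS/GEOS implements ST_IsSimple. It ignores collinear overlaps and closed rings, so treat it as an illustration only.

```python
from itertools import combinations


def segment_intersection(p1, p2, p3, p4):
    """Return the crossing point of segments p1-p2 and p3-p4, or None.

    Parallel and collinear segments are reported as non-crossing
    (a simplification made for this sketch).
    """
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None
    # Parameters of the crossing along each segment (0..1 means "on it").
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None


def self_intersections(coords):
    """Crossing points between non-adjacent segments of an open polyline."""
    segments = list(zip(coords, coords[1:]))
    hits = []
    for (i, a), (j, b) in combinations(enumerate(segments), 2):
        if j == i + 1:
            continue  # neighbouring segments always share a vertex
        point = segment_intersection(a[0], a[1], b[0], b[1])
        if point is not None:
            hits.append(point)
    return hits


bowtie = [(50, 50), (150, 150), (50, 150), (150, 50)]
print(self_intersections(bowtie))
```

For the linestring above, the first and third segments cross at (100.0, 100.0), which is the node ST_UnaryUnion inserts when it splits the line.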
If I understand correctly, your question consists of 2 parts:
1. How to merge many linestrings into 1
2. How to get the point 1000 km further down the road from point 1
Part 1 needs some more explanation from you. It seems you already did a linemerge, and that works for me as well with the data you gave. Maybe you can post a new stackexchange question describing precisely what is not working for you. Merging linestrings from a line spaghetti can be difficult!
Part 2 I'll answer here.
Your query was already well on its way, but it needed some cleanup and the use of the 'measurement' option in PostGIS. Another important part is that you first have to reproject your dataset to something other than lat/lon, because otherwise you would get distances in degrees. Keep in mind that Mercator units are only true metres at the equator, so at this latitude a local projected CRS would give more accurate distances. Here is what I made of it:
WITH lines AS (
--CREATE THE INITIAL DATASET WITH LINESTRINGS
SELECT ST_LineFromText('LINESTRING(-1.3326397 50.9174932,-1.3319842 50.9166939)',4326) geom
UNION ALL
SELECT ST_LineFromText('LINESTRING(-1.3333297 50.9183351,-1.3332187 50.9182141,-1.3330436 50.9179954,-1.3326397 50.9174932)',4326) geom
UNION ALL
SELECT ST_LineFromText('LINESTRING(-1.3338982 50.9186728,-1.333581 50.9185143,-1.3333297 50.9183351)',4326) geom
UNION ALL
SELECT ST_LineFromText('LINESTRING(-1.3341242 50.9187719,-1.3338982 50.9186728)',4326) geom
UNION ALL
SELECT ST_LineFromText('LINESTRING(-1.335291 50.919197,-1.3346072 50.918916,-1.3341242 50.9187719)',4326) geom
)
,route AS (
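--MERGE THEM INTO ONE LINE AND TRANSFORM TO MERCATOR PROJECTION
--(ST_LineMerge joins the pieces end to end; ST_MakeLine would chain them in row order and zigzag here)
SELECT ST_Transform(ST_LineMerge(ST_Collect(geom)),900913) geom FROM lines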
)
,routem AS (
--ADD A MEASUREMENT ALONG THE LINE FROM 0 TO THE TOTAL LENGTH AND STORE THE LENGTH (for later use)
SELECT ST_AddMeasure(route.geom, 0, ST_Length(route.geom)) geom, ST_Length(route.geom) l FROM route
)
,point AS (
--CREATE OUR INITIAL POINT AND TRANSFORM TO MERCATOR PROJECTION
SELECT ST_Transform(ST_PointFromText('POINT(-1.3333297 50.9183351)',4326),900913) geom
)
SELECT
ST_AsText(point.geom) as startpoint,
ST_AsText(
--GET THE SECOND POINT ACCORDING TO DISTANCE ALONG LINE
ST_LocateAlong(routem.geom,
--GET THE DISTANCE OF THE FIRST POINT AND ADD SOME DISTANCE (here 100m)
ST_LineLocatePoint(routem.geom, point.geom) * l + 100)
) endpoint
FROM
routem,
point
I'm sure there is more than 1 way to get this done, but I believe this is the easiest and shortest. The linear referencing tools in PostGIS can be a little confusing, since some of them work with fractions and others with real measurements. Reading the docs a couple of times helped me ;)
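If the fraction-versus-measure distinction is still fuzzy, this small pure-Python sketch mimics the two steps on plain planar coordinates. It assumes the coordinates are already in a projected CRS with metre units. The function names are made up for illustration; they are not PostGIS functions, although locate_point plays the role of ST_LineLocatePoint (returns a fraction) and point_at_measure the role of ST_LocateAlong (takes a real measure).

```python
import math


def cumulative_lengths(coords):
    """Measure (distance from the start) at every vertex of a polyline."""
    measures = [0.0]
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        measures.append(measures[-1] + math.hypot(x2 - x1, y2 - y1))
    return measures


def locate_point(coords, point):
    """Fraction (0..1) along the line of the position closest to `point`."""
    measures = cumulative_lengths(coords)
    if measures[-1] == 0:
        return 0.0
    px, py = point
    best_dist, best_measure = math.inf, 0.0
    for i, ((x1, y1), (x2, y2)) in enumerate(zip(coords, coords[1:])):
        dx, dy = x2 - x1, y2 - y1
        seg_len2 = dx * dx + dy * dy
        if seg_len2 == 0:
            t = 0.0
        else:
            # Project the point onto the segment and clamp to its ends.
            t = max(0.0, min(1.0, ((px - x1) * dx + (py - y1) * dy) / seg_len2))
        dist = math.hypot(px - (x1 + t * dx), py - (y1 + t * dy))
        if dist < best_dist:
            best_dist = dist
            best_measure = measures[i] + t * math.sqrt(seg_len2)
    return best_measure / measures[-1]


def point_at_measure(coords, measure):
    """Point at an absolute distance along the line, or None if off the line."""
    measures = cumulative_lengths(coords)
    if measure < 0 or measure > measures[-1]:
        return None
    for i in range(len(coords) - 1):
        if measure <= measures[i + 1]:
            seg = measures[i + 1] - measures[i]
            t = 0.0 if seg == 0 else (measure - measures[i]) / seg
            (x1, y1), (x2, y2) = coords[i], coords[i + 1]
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return coords[-1]


route = [(0, 0), (300, 0), (300, 400)]          # total length 700
total = cumulative_lengths(route)[-1]
fraction = locate_point(route, (100, 5))         # a fraction, not metres
endpoint = point_at_measure(route, fraction * total + 100)
print(fraction, endpoint)
```

The multiplication by the total length is the same conversion the query does with `* l`: it turns the fraction into a real measure before adding the 100 m.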
Hope it serves you.
Best, Tom
This is a pure PyQGIS implementation of the Hausdorff distance, solely for comparing polylines. If you find the Wikipedia page hard to understand, try to think of it this way: it is a distance that lies somewhere between the minimum and maximum distance between two lines, but it is not a simple statistical mean or median distance. It tends to be somewhat higher than the mean (min+max)/2, and it is really a neat mathematical way of calculating the "greatest of all the distances from a point in one set [line] to the closest point in the other set [line]". And yes, it's an actual distance between two nodes on both lines!
And to answer the OP's question, finding the closest/most-similar linestring would probably work in a loop like so:
Credits to Anita Graser's script, on which the above code is based and which can be found on GitHub. The script here is probably more convoluted than Anita's, but it does not depend on scipy.spatial, whose installation through OSGeo4W didn't work for me. If you can use it, I would actually recommend the linked numpy method: the code is much shorter and cleaner.
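For readers who only want the idea, here is a minimal standalone sketch of the definition quoted above, in plain Python with no QGIS, numpy or scipy. It is not the PyQGIS script this answer refers to. It assumes each line is just a list of (x, y) vertices and compares vertices only, and the function names and sample lines are invented for the example.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a vertex of `a` to its nearest vertex of `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric (vertex-based) Hausdorff distance between two polylines."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def most_similar(target, candidates):
    """Name of the candidate line with the smallest distance to `target`."""
    return min(candidates, key=lambda name: hausdorff(target, candidates[name]))


# Made-up sample data: one reference line and two candidates.
target = [(0, 0), (10, 0)]
candidates = {
    "near": [(0, 1), (10, 1)],
    "far": [(0, 5), (10, 8)],
}
for name, line in candidates.items():
    print(name, hausdorff(target, line))
print(most_similar(target, candidates))
```

The double loop is quadratic in the number of vertices, which is fine for a handful of lines but slow for large layers.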