[GIS] Similarity between two or more trajectories

postgis postgresql similarity

I have truck data from http://www.chorochronos.org/.

The data are GPS coordinates of multiple truck trajectories in Athens.

I have to calculate the similarity between the trajectories in order to delete those that are very similar.

Problem:

Red and green are similar, but blue, black, and (red or green) are different trajectories.
I want to delete one of the two similar ones, red or green.

The data are stored as points (geometry, lat and long, x and y; GPS coordinates). The image shows example trajectories.

Best Answer

A really easy, but not fantastic, measure is the Hausdorff distance between each pair of trajectories, which is computed with the ST_HausdorffDistance function. Using approximate LineStrings from your figure, these are all shown in blue, and the Hausdorff distance is shown for one of the pairs of lines in red:

[Figure: Hausdorff distance]

And the query to sort the 6 combinations in descending order:

WITH data AS (
  SELECT 'blue' AS name, 'LINESTRING (60 200, 110 290, 200 320, 330 320, 430 240, 450 200)'::geometry AS geom
  UNION SELECT 'black', 'LINESTRING (60 200, 120 270, 235 297, 295 207, 450 200)'::geometry
  UNION SELECT 'green', 'LINESTRING (60 200, 280 190, 450 200)'::geometry
  UNION SELECT 'red', 'LINESTRING (60 200, 150 210, 257 195, 360 210, 430 190, 450 200)'::geometry)
SELECT a.name || ' <-> ' || b.name AS compare, ST_HausdorffDistance(a.geom, b.geom)
FROM data a, data b WHERE a.name < b.name
ORDER BY ST_HausdorffDistance(a.geom, b.geom) DESC;

     compare     | st_hausdorffdistance
-----------------+----------------------
 blue <-> green  |                  130
 blue <-> red    |                  125
 black <-> blue  |     110.102502131467
 black <-> green |     104.846289061163
 black <-> red   |     97.9580173908678
 green <-> red   |     15.2677257073823
(6 rows)
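
To make the measure concrete, here is a rough sketch that reproduces the blue <-> green value by hand: for every vertex of one line it takes the distance to the other line, and keeps the largest value found in either direction. It assumes ST_HausdorffDistance uses the discrete, vertex-based definition; the CTE names a and b and the column names hausdorff_by_hand and hausdorff_builtin are made up for illustration.

```sql
-- Sketch: rebuild the Hausdorff distance for blue vs. green from vertex distances.
WITH a AS (
  SELECT 'LINESTRING (60 200, 110 290, 200 320, 330 320, 430 240, 450 200)'::geometry AS geom  -- blue
), b AS (
  SELECT 'LINESTRING (60 200, 280 190, 450 200)'::geometry AS geom  -- green
)
SELECT
  GREATEST(
    -- largest distance from any blue vertex to the green line
    (SELECT max(ST_Distance(p.geom, b.geom)) FROM a, b, ST_DumpPoints(a.geom) AS p),
    -- largest distance from any green vertex to the blue line
    (SELECT max(ST_Distance(p.geom, a.geom)) FROM a, b, ST_DumpPoints(b.geom) AS p)
  ) AS hausdorff_by_hand,
  ST_HausdorffDistance(a.geom, b.geom) AS hausdorff_builtin
FROM a, b;
```

If that assumption holds, both columns should match the 130 reported for blue <-> green above. The value comes from the single green vertex at (280 190), which lies 130 units below the horizontal blue segment at y = 320.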

So it works fine for this example, but it isn't a great or robust technique for clustering lines, since the metric depends only on the single point that lies farthest from the other line, rather than on how the complete lines differ. There are much better methods, but they will be more complicated.
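
With that limitation in mind, here is a minimal sketch of how the same measure might be applied to the point data from the question. The table truck_points, its columns traj_id, t and geom, the output table trajectories, and the threshold of 20 are all assumptions for illustration. None of them come from the dataset itself.

```sql
-- Hypothetical input: truck_points(traj_id integer, t timestamp, geom geometry)
-- with one row per GPS fix.

-- 1. Build one LineString per trajectory, with points ordered by time.
CREATE TABLE trajectories AS
SELECT traj_id, ST_MakeLine(geom ORDER BY t) AS geom
FROM truck_points
GROUP BY traj_id;

-- 2. For every pair closer than the threshold, delete the one with the higher id.
--    The threshold (20 here, an arbitrary value) is in the units of the
--    geometry's coordinate system, so it is only meaningful as a length
--    if the data are in a projected system.
DELETE FROM trajectories b
USING trajectories a
WHERE a.traj_id < b.traj_id
  AND ST_HausdorffDistance(a.geom, b.geom) < 20;
```

Applied to the four example lines with a threshold of 20, only the green <-> red pair would qualify, so one of those two would be removed. Note that this rule can remove more than intended in chains: if A is similar to B and B is similar to C, both B and C are deleted even when A and C are not similar to each other. The pairwise comparison also grows quadratically with the number of trajectories.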
