[GIS] Calculating shortest path between each lat/lon point in one table and each lan/lon point in another in PostgreSQL

dijkstraosm2popgroutingpostgresqlrouting

I have two tables, both of them have only 2 columns. The first one contain only latitude and longitude coordinates of my source points and the other one only latitude and longitude coordinates of my target points. For each source point I have to find the closest (along the streets) target point (and tell what's the distance). I have already developed a function give_me_cost, that finds the closest node and performs the routing:

create or replace function nearest_node (double precision, double precision) returns 

integer as
$$
SELECT id::integer as source_id FROM roads_vertices_pgr
                        ORDER BY the_geom <-> ST_GeometryFromText('POINT('
                        ||$1|| ' ' ||$2||')',4326) LIMIT 1;
$$ language sql;

CREATE OR REPLACE FUNCTION give_me_cost(double precision, double precision, double precision, double precision)
  RETURNS decimal(8,8) AS
$BODY$
with subset as (
SELECT seq, id1 AS node, id2 AS edge, cost
  FROM pgr_dijkstra(
    'SELECT gid as id, source, target, st_length(geom) as cost FROM roads',
    nearest_node($1,$2),nearest_node($3,$4), FALSE, False
  ) as di
)
select sum(cost)::decimal (8,8) from subset

$BODY$
  LANGUAGE sql VOLATILE;

Now I have to use it in some kind of loop, but don't know how. The most preferable solution would be the function, where I can input my source table and my target table and get an output of a table with sources, the closest targets and the distances between them.

Best Answer

Read this answer to Nearest facility with pgRouting which is a quite similar case. If it will not then some information that would help is:

  • Size of your source and target tables
  • Size of your table with road network
  • Whether it is OK to find nearest point euclidean and then count road distance

One of resolutions is (pseudo-code):

Create table with distance from every source to every target:

Create table distances_tmp as
SELECT
  sources.id as source_id
  nearest_node(sources.x, sources.y) as source_node
  targets.id as target_id
  nearest_node(targets.x, targets.y) as target_node
  give_me_cost(sources.x, sources.y,targets.x, targets.y) as cost
FROM
  sources
JOIN
  targets on 1=1

Create table with only lowest costs

Create table distances as
select * 
from distances_tmp d1
where not exists (select 1 from distances_tmp d2 where d2.cost < d1.cost)

I wrote it in two steps cause it's easier to understand it, but there is some ways to join this two queries into one (eg. select from select or WITH clause)

Also consider changes in nearest_node function - nearest vertex of nearest edge is not always best place to start (or stop) routing if nearest edge is one-way street, so your cost will not always be accurate.

Related Question