Firstly few questions:
1. what do you mean by converting roads to nodes? Maybe you meant network graph?
2. Are you able to find at least one way and the trouble is with many ways?
3. Nearest school euclidean or by the roads?
4. What was the work-flow of this conversion form roads to nodes?
5. What are the names of tables and column names in this tables?
6. How big are your tables with students and schools?
7. Is it OK to select nearest school euclidean
Please add information to your question so maybe there will be simplest to help you.
But back to the point:
PG_Routing is not a resolution out of the box since it provides a kind of 'low level' stored procedures. You can't write query like 'select pg_CountMeTheWay(student, scholl), but it can find shortest way if you'll give it source edge id, target edge id and set of edges as graph with cost.
Also function st_DrivingDistance is not finding route, but it's creating set of edges reachable in specified time (e.g. how far you can go in 15 minutes).
Also 2 you need distance from student to nearest school but which is nearest if you don't know the distance? Do you have some kind of relation between schools and students? If not you have to count distance from one student to every school to know which is nearest. In most cases nearest euclidean (in straight line) will be also nearest by the roads, but if you have some special cases in your graph such as river with one bridge you cant be sure.
There are many resolutions of your problem even without this questions to your question, so please try to be more specific.
EDIT
Ok, so hope you have a lot of time :)
Firstly - to know the driving distance you have to (one way or another) find shortest path - there is no smoke without fire...
1. Driving distance is not doing anything different then this, but under another conditions. It's like space-time in military meaning - you have to dig ditch from here to everywhere till 4:00pm.
2. Also as far as I know driving_distance is working like window functions (e.g. row_number() over()) so it's designating cost during calculations, so you can't make him to do calculations for all edges, but designate cost only to few. Of course you can do select from select but this means that distance to every edge will be calculated anyway.
3. pgr_Driving_distance is much more slower then pgr_dijkstra - finding one route between points spaced about 15km is taking on my dataset (all Poland) about 1s (0,5-2,5), driving distance with the same condition will be about 15s
Basing upon this times counting:
1. Shortest paths between all students and scholls (100000x50x2) will last 115 days
2. Shortest paths between all students and scholls with minimal time will last about 46 days
3. Mixed version - Shortest paths between all students and 10 euclidean nearest to them scholls will last about 10 days
4. driving_distance for all students till 15km will last 17 days
5. Short version - Shortest paths between all students and 1 euclidean nearest to them scholl will last about 1 day.
6. Performance depends on numer of rows in edges table (network graph) so less records less time you'll need.
Possible resolution (1):
- Write stored procedure with your pgr_djikstra() with id_source, id_target as input and cost as output (there are many exapmples on network)
- create table with id_student, id_source, id_school, id_target, cost
- Fill it with Id's and nearest edges of network graph
- update it with cost of shortest path using your function
- select for every student record with the lowest cost
Possible resolution (2):
- Write stored procedure with driving_distance with source_id as input and retorn output of driving_distance
- Write stored procedure with geometry as input and nearest edge id as output.
- create tables with students and their nearest edges using above procedure, the same for schools
- create table a as select *, your_proc(nearest_edge_id) as dist from students_nearest.
- create another table b as select * from a where exists id of nearest student and school id
- Select from this table records with student_id, school_id, min(cost) group by student_id, school_id
IMO the best resolution as compromise of time and quality is to find 3 euclidean nearest schools for every student and count for them shortest paths, then choose one - the shortest. With some tuning and only absolutely necessary edges in network graph you can feet in few hours.
I don't have a fully worked out answer, but maybe enough to get you started. These functions might help:
ST_Line_Locate_Point() gives the distance as a portion of the total line length of a point along a line. ST_Line_Interpolate_Point() returns a point geometry for a point at a given distance (again as a portion of the total length) along a line.
What I suggest it to get the point geometry of a location 'new_pts' on your line which are closest to your gpspoints (in a subquery below, formulated as a WITH clause). Then use that point to find the nearest point actually on the line, but at a very slightly smaller distance, so it will be closer to the beginning.
Next use this found point, together with the 'new_pt' to get the azimuth between them. This would more or less represent the tangent to the line at that point.
Then you can compare this angle to the azimuth of your gpspoints to determine whether to add the gpspoint or not.
with new_pts AS (
SELECT ST_ClosestPoint(geom, gpspoints) AS geompt
FROM routes JOIN gpspoints ON ST_Distance(routes.geom, gpspoints.geom)<20
)
SELECT ST_Azimuth(new_pts.geompt,
ST_Line_Interpolate_Point(routes.geom,
ST_Line_Locate_Point(routes.geom, new_pts.geompt)*0.999))
FROM routes JOIN new_pts ON ST_Distance(routes.geom,new_pts.geompt)<20;
Again, not tested, but I hope it will be some help.
Best Answer
How about just using
ST_ClosestPoint
?http://postgis.net/docs/manual-1.5/ST_ClosestPoint.html