You can use a recursive query to explore nearest neighbor of each point starting from each detected end of lines you want to build.
Prerequisites : prepare a postgis layer with your points and another with a single Multi-linestring object containing your roads. The two layers must be on the same CRS. Here is the code for the test data-set I created, please modify it as needed. (Tested on postgres 9.2 and postgis 2.1)
WITH RECURSIVE
points as (SELECT id, st_transform((st_dump(wkb_geometry)).geom,2154) as geom, my_comment as com FROM mypoints),
roads as (SELECT st_transform(ST_union(wkb_geometry),2154) as geom from highway),
Here are the steps:
Generate for each point the list of every neighbors and theirs distance that meet theses three criteria.
- Distance must not exceed a user defined threshold (this will avoid linking to isolated point)
graph_full as (
SELECT a.id, b.id as link_id, a.com, st_makeline(a.geom,b.geom) as geom, st_distance(a.geom,b.geom) as distance
FROM points a
LEFT JOIN points b ON a.id<>b.id
WHERE st_distance(a.geom,b.geom) <= 15
),
- Direct path must not cross a road
graph as (
SELECt graph_full.*
FROM graph_full RIGHT JOIN
roads ON st_intersects(graph_full.geom,roads.geom) = false
),
Distance must not exceed a user defined ratio of the distance from the nearest neighbor (this should accommodate better to irregular digitalization than fixed distance) This part was actually too hard to implement, sticked to fixed search radius
Let's call this table "the graph"
Select end of line point by joining to the graph and keeping only point that have exactly one entry in the graph.
eol as (
SELECT points.* FROM
points JOIN
(SELECT id, count(*) FROM graph
GROUP BY id
HAVING count(*)= 1) sel
ON points.id = sel.id),
Let's call this table "eol" (end of line)
easy? that the reward for doing a great graph but hold-on things will go crazy at next step
Set up a recursive query that will cycle from neighbors to neighbors starting from each eol
- Initialize the recursive query using eol table and adding a counter for the depth, an aggregator for the path, and a geometry constructor to build the lines
- Move to next iteration by switching to nearest neighbor using the graph and checking that you never go backward using the path
- After the iteration is finished keep only the longest path for each starting point (if your dataset include potential intersection between expect lines that part would need more conditions)
recurse_eol (id, link_id, depth, path, start_id, geom) AS (--initialisation
SELECT id, link_id, depth, path, start_id, geom FROM (
SELECT eol.id, graph.link_id,1 as depth,
ARRAY[eol.id, graph.link_id] as path,
eol.id as start_id,
graph.geom as geom,
(row_number() OVER (PARTITION BY eol.id ORDER BY distance asc))=1 as test
FROM eol JOIn graph ON eol.id = graph.id
) foo
WHERE test = true
UNION ALL ---here start the recursive part
SELECT id, link_id, depth, path, start_id, geom FROM (
SELECT graph.id, graph.link_id, r.depth+1 as depth,
path || graph.link_id as path,
r.start_id,
ST_union(r.geom,graph.geom) as geom,
(row_number() OVER (PARTITION BY r.id ORDER BY distance asc))=1 as test
FROM recurse_eol r JOIN graph ON r.link_id = graph.id AND NOT graph.link_id = ANY(path)) foo
WHERE test = true AND depth < 1000), --this last line is a safe guard to stop recurring after 1000 run adapt it as needed
Let's call this table "recurse_eol"
Keep only longest line for each start point and remove every exact duplicate path
Example : paths 1,2,3,5 AND 5,3,2,1 are the same line discovered by it's two differents "end of line"
result as (SELECT start_id, path, depth, geom FROM
(SELECT *,
row_number() OVER (PARTITION BY array(SELECT * FROM unnest(path) ORDER BY 1))=1 as test_duplicate,
(max(depth) OVER (PARTITION BY start_id))=depth as test_depth
FROM recurse_eol) foo
WHERE test_depth = true AND test_duplicate = true)
SELECT * FROM result
Manually checks remaining errors (isolated points, overlapping lines, weirdly shaped street)
Updated as promised, I still can't figure out why sometimes recursive query don't give exact same result when starting from opposite eol of a same line so some duplicate may remain in result layer as of now.
Feel free to ask I totally get that this code need more comments. Here is the full query:
WITH RECURSIVE
points as (SELECT id, st_transform((st_dump(wkb_geometry)).geom,2154) as geom, my_comment as com FROM mypoints),
roads as (SELECT st_transform(ST_union(wkb_geometry),2154) as geom from highway),
graph_full as (
SELECT a.id, b.id as link_id, a.com, st_makeline(a.geom,b.geom) as geom, st_distance(a.geom,b.geom) as distance
FROM points a
LEFT JOIN points b ON a.id<>b.id
WHERE st_distance(a.geom,b.geom) <= 15
),
graph as (
SELECt graph_full.*
FROM graph_full RIGHT JOIN
roads ON st_intersects(graph_full.geom,roads.geom) = false
),
eol as (
SELECT points.* FROM
points JOIN
(SELECT id, count(*) FROM graph
GROUP BY id
HAVING count(*)= 1) sel
ON points.id = sel.id),
recurse_eol (id, link_id, depth, path, start_id, geom) AS (
SELECT id, link_id, depth, path, start_id, geom FROM (
SELECT eol.id, graph.link_id,1 as depth,
ARRAY[eol.id, graph.link_id] as path,
eol.id as start_id,
graph.geom as geom,
(row_number() OVER (PARTITION BY eol.id ORDER BY distance asc))=1 as test
FROM eol JOIn graph ON eol.id = graph.id
) foo
WHERE test = true
UNION ALL
SELECT id, link_id, depth, path, start_id, geom FROM (
SELECT graph.id, graph.link_id, r.depth+1 as depth,
path || graph.link_id as path,
r.start_id,
ST_union(r.geom,graph.geom) as geom,
(row_number() OVER (PARTITION BY r.id ORDER BY distance asc))=1 as test
FROM recurse_eol r JOIN graph ON r.link_id = graph.id AND NOT graph.link_id = ANY(path)) foo
WHERE test = true AND depth < 1000),
result as (SELECT start_id, path, depth, geom FROM
(SELECT *,
row_number() OVER (PARTITION BY array(SELECT * FROM unnest(path) ORDER BY 1))=1 as test_duplicate,
(max(depth) OVER (PARTITION BY start_id))=depth as test_depth
FROM recurse_eol) foo
WHERE test_depth = true AND test_duplicate = true)
SELECT * FROM result
Best Answer
You could use Select By Location with your counties as your target layer and your line feature as your source layer. Your selection method would be features that intersect the source layer feature.
If you don't already have an existing line shapefile, you can create your own by first creating a new shapefile or featureclass using ArcCatalog, adding it into ArcMap, and then following these guidlines for editing features. Because your straight line is a theoretical line for your analysis, creating it yourself should be ok.
Once you have your line feature and county shapefile, run the select by location as I mentioned above. Once your features are selected, you can right click your county shapefile in the table of contents and choose Data -> Export Data. Make sure that you have the option to export only selected features checked. Your exported shapefile or featureclass will be just the counties that your line intersects.