R sf Package – How to Split Polyline by Point

pointpolyline-creationrsfsplitting

I'm learning the sf package in R and would like to locate about 450 sites along the nearest reaches of a river network and to obtain the upstream and downstream distances along the reach where the site is located. I've include one simple feature POINT (site) and LINESTRING or polyline (reach) to provide a reproducible example at the bottom of my post.

Given a point located exactly along the polyline, how would I split the polyline at that point to obtain two polylines? I would like to obtain the lengths of the split polylines both upstream and downstream of the point in order to edit a graph representation of the whole river.

A similar question on SE is submitted here but involves splitting a polyline using polygons. My latest attempt with st_split() yields a GEOMETRYCOLLECTION, but it seems that the LINESTRING coordinates are not listed separately:

library(sf)
# Get the site/reach objects from dput()...
split.reach <- st_split(reach$geometry, site$geometry)
> split.reach
Geometry set for 1 feature 
geometry type:  GEOMETRYCOLLECTION
dimension:      XY
bbox:           xmin: 688358.8 ymin: 171876.4 xmax: 688465 ymax: 172171.2
epsg (SRID):    NA
proj4string:    +proj=somerc +lat_0=46.95240555555556 +lon_0=7.439583333333333 +k_0=1 +x_0=600000 +y_0=200000 +ellps=bessel +units=m +no_defs
GEOMETRYCOLLECTION (LINESTRING (688400.1 171876...

To be clear, in this specific problem I'm interested in lengths along an individual river reach (i.e., an individual row or LINESTRING in an 'sf' object) that is split by a point and not along the whole river network. I would also like to avoid writing code that depends on a large number of spatial packages (i.e., stick with sf)

The dput for a site and reach as an example:

site <- structure(list(SiteId = "PF_00183", SampId = "PF_00183 _ 2014-10-15", 
    Longitude.In = 8.59713558862371, Latitude.In = 46.6953718769341, 
    X.In = 688524.841170469, Y.In = 172082.839248429, ReachId = 125010L, 
    Distance = 76.8220253856169, Longitude.Near = 8.59613333686445, 
    Latitude.Near = 46.6954168201561, geometry = structure(list(
        structure(c(688448.125, 172086.703125001), class = c("XY", 
        "POINT", "sfg"))), n_empty = 0L, precision = 0, crs = structure(list(
        epsg = NA_integer_, proj4string = "+proj=somerc +lat_0=46.95240555555556 +lon_0=7.439583333333333 +k_0=1 +x_0=600000 +y_0=200000 +ellps=bessel +units=m +no_defs"), .Names = c("epsg", 
    "proj4string"), class = "crs"), bbox = structure(c(688448.125, 
    172086.703125001, 688448.125, 172086.703125001), .Names = c("xmin", 
    "ymin", "xmax", "ymax"), class = "bbox"), class = c("sfc_POINT", 
    "sfc"))), .Names = c("SiteId", "SampId", "Longitude.In", 
"Latitude.In", "X.In", "Y.In", "ReachId", "Distance", "Longitude.Near", 
"Latitude.Near", "geometry"), row.names = c(NA, -1L), sf_column = "geometry", agr = structure(c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_), class = "factor", .Label = c("constant", 
"aggregate", "identity"), .Names = c("SiteId", "SampId", "Longitude.In", 
"Latitude.In", "X.In", "Y.In", "ReachId", "Distance", "Longitude.Near", 
"Latitude.Near")), class = c("sf", "data.table", "data.frame"
))
reach <- structure(list(FNODE_ = 148558, TNODE_ = 148142, LENGTH = 343.00790968, 
    ReachId = 125010L, geometry = structure(list(structure(c(688400.125, 
    688394.375, 688385.875, 688375.6875, 688363.625, 688359.8125, 
    688358.8125, 688361.125, 688375.6875, 688389.375, 688411.1875, 
    688432.875, 688445.1875, 688448.125, 688445.875, 688446.625, 
    688448.125, 688455.375, 688465, 171876.40625, 171890.40625, 
    171906.203125, 171916.703125, 171929.09375, 171944.703125, 
    171952.59375, 171959.90625, 171980.296875, 171997.09375, 
    172017.796875, 172041.90625, 172063.5, 172086.703125, 172109.90625, 
    172129, 172138.5, 172153.5, 172171.203125), .Dim = c(19L, 
    2L), class = c("XY", "LINESTRING", "sfg"))), class = c("sfc_LINESTRING", 
    "sfc"), precision = 0, bbox = structure(c(688358.8125, 171876.40625, 
    688465, 172171.203125), .Names = c("xmin", "ymin", "xmax", 
    "ymax"), class = "bbox"), crs = structure(list(epsg = NA_integer_, 
        proj4string = "+proj=somerc +lat_0=46.95240555555556 +lon_0=7.439583333333333 +k_0=1 +x_0=600000 +y_0=200000 +ellps=bessel +units=m +no_defs"), .Names = c("epsg", 
    "proj4string"), class = "crs"), n_empty = 0L)), .Names = c("FNODE_", 
"TNODE_", "LENGTH", "ReachId", "geometry"), row.names = 125010L, class = c("sf", 
"data.frame"), sf_column = "geometry", agr = structure(c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_), .Names = c("FNODE_", 
"TNODE_", "LENGTH", "ReachId"), .Label = c("constant", "aggregate", 
"identity"), class = "factor"))

Best Answer

If you snap the point to the line then you can split the line and then extract the parts from the resulting collection. Use a really small tolerance, I don't know how small it needs to be...

> site_snap = st_snap(site, reach, tol=1e-9)
> parts = st_collection_extract(st_split(reach$geometry, site_snap$geometry),"LINESTRING")

parts is now two features, either side of your point:

> parts
Geometry set for 2 features 
geometry type:  LINESTRING
dimension:      XY
bbox:           xmin: 688358.8 ymin: 171876.4 xmax: 688465 ymax: 172171.2
epsg (SRID):    NA
proj4string:    +proj=somerc +lat_0=46.95240555555556 +lon_0=7.439583333333333 +k_0=1 +x_0=600000 +y_0=200000 +ellps=bessel +units=m +no_defs
LINESTRING (688400.1 171876.4, 688394.4 171890....
LINESTRING (688448.1 172086.7, 688445.9 172109....

Your original site doesn't intersect the line, so your split returned nothing. Test with:

> st_intersects(site, reach, sparse=FALSE)
      [,1]
[1,] FALSE
> st_intersects(site_snap, reach, sparse=FALSE)
     [,1]
[1,] TRUE

Your original site is 10^-11 units from the line feature:

> st_distance(site$geom, reach$geom)
Units: m
             [,1]
[1,] 9.550647e-11
Related Question