GeoPandas – How to Split Line by Nearest Points

geopandas, shapely, snapping, splitting

I have two GeoPandas GeoDataFrames.
One contains several LineStrings that together plot as a single line. The other contains rows where each row is a unique point close to that line.

Now I would like to split the line at the locations of the points and get the result back as a GeoDataFrame.

My input looks like this:

[image: the line with the nearby points]

Best Answer

First, union the geometries of each GeoDataFrame into a MultiLineString and a MultiPoint (in GeoPandas 1.0 and later, union_all() replaces the deprecated unary_union):

line = gdf_line.geometry.unary_union
coords = gdf_point.geometry.unary_union

Using shapely.ops.snap and shapely.ops.split, it is possible to snap the points to the line (within a given tolerance) and then use the snapped points to split the line. The result is a GeometryCollection:

split(line, snap(coords, line, tolerance=1))
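As a small illustration of how the tolerance behaves, here is a minimal sketch on made-up toy geometries (the coordinates and variable names are assumptions, not data from the question). It relies on the documented behaviour that snap moves points onto vertices of the reference geometry, so a point that is close to the line but far from any vertex is left where it is and does not split the line.

```python
from shapely.geometry import LineString, MultiPoint
from shapely.ops import snap, split

# Toy line with vertices at x = 0, 5 and 10 (hypothetical data)
line = LineString([(0, 0), (5, 0), (10, 0)])

# A point within the tolerance of the vertex at (5, 0):
# it is snapped onto that vertex and the line is cut there
near_vertex = MultiPoint([(5, 0.3)])
snapped = snap(near_vertex, line, 1)
parts_near_vertex = list(split(line, snapped).geoms)

# A point equally close to the line, but about 2.5 units from any vertex:
# it is not snapped, so it does not lie on the line and nothing is cut
mid_segment = MultiPoint([(2.5, 0.3)])
parts_mid_segment = list(split(line, snap(mid_segment, line, 1)).geoms)

print(len(parts_near_vertex), len(parts_mid_segment))
```

On this toy input the first case yields two parts and the second leaves the line in one piece, which is one reason the tolerance can feel like trial and error on real data.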

To combine these steps and return a GeoDataFrame, use the following function:

import geopandas as gpd
from shapely.ops import split, snap


def split_line_by_nearest_points(gdf_line, gdf_points, tolerance):
    """
    Split the union of lines with the union of points, returning the
    resulting segments.

    Parameters
    ----------
    gdf_line : GeoDataFrame
        GeoDataFrame with multiple rows of connecting line segments
    gdf_points : GeoDataFrame
        GeoDataFrame with multiple rows of single points
    tolerance : float
        snapping tolerance passed to shapely.ops.snap, in CRS units

    Returns
    -------
    gdf_segments : GeoDataFrame
        GeoDataFrame of segments
    """

    # union all geometries
    line = gdf_line.geometry.unary_union
    coords = gdf_points.geometry.unary_union

    # snap and split coords on line
    # returns GeometryCollection
    split_line = split(line, snap(coords, line, tolerance))

    # transform GeometryCollection to GeoDataFrame
    # (use .geoms: collections are not directly iterable in Shapely 2.0)
    segments = list(split_line.geoms)

    # one row per segment, keeping the CRS of the input line
    gdf_segments = gpd.GeoDataFrame(
        {'index': list(range(len(segments)))},
        geometry=segments, crs=gdf_line.crs)

    return gdf_segments

The result can be plotted as follows (I still find a suitable value for the tolerance by trial and error):

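import matplotlib.pyplot as plt

# tolerance=1 as in the example above; adjust for your data
gdf_segments = split_line_by_nearest_points(gdf_line, gdf_points, tolerance=1)

fig, ax = plt.subplots()
gdf_line.plot(ax=ax, lw=6, color='gray')
gdf_segments.plot(ax=ax, column='index', lw=3, cmap='Paired')
gdf_points.plot(ax=ax, zorder=3)
plt.show()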


--EDIT

The snap function is not the same as a nearest-point query: it snaps to vertices of the line rather than to the closest location along it. I ended up using the function at https://github.com/ojdo/python-tools/blob/master/shapelytools.py#L144 from https://github.com/ojdo/python-tools, which provides many interesting functions for shapely.
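For readers who want a nearest-point variant without an extra dependency, the following is a minimal sketch, not the linked function. It assumes a single LineString (a MultiLineString from a union can usually be merged with shapely.ops.linemerge first) and Shapely 1.7 or later for shapely.ops.substring. The name split_line_at_nearest_points and the toy coordinates are hypothetical.

```python
from shapely.geometry import LineString, Point
from shapely.ops import substring


def split_line_at_nearest_points(line, points):
    """Cut a LineString at the locations on it nearest to each point."""
    # Distance along the line of the location closest to each point
    distances = sorted({line.project(p) for p in points})
    # Ignore cuts that would fall exactly on the start or end of the line
    distances = [d for d in distances if 0 < d < line.length]
    # Consecutive pairs of distances delimit the output segments
    bounds = [0.0] + distances + [line.length]
    return [substring(line, start, end)
            for start, end in zip(bounds[:-1], bounds[1:])]


# Toy data (hypothetical): neither point is near a vertex of the line
line = LineString([(0, 0), (5, 0), (10, 0)])
points = [Point(2.5, 0.3), Point(7, -0.2)]

segments = split_line_at_nearest_points(line, points)
for segment in segments:
    print(segment.wkt)
```

Because the cut positions are expressed as distances along the line, no tolerance is needed. With GeoDataFrames, the inputs could be obtained as linemerge(gdf_line.geometry.unary_union) and list(gdf_points.geometry), and the returned list passed as the geometry of a new GeoDataFrame.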
