GeoPandas – How to Convert a LineString GeoDataFrame to a Points GeoDataFrame While Retaining Vertex Order

geopandas, lines-to-points, linestring, python 3

I have a LineString GeoDataFrame that I am trying to convert into a Points GeoDataFrame, but I want to retain the grouping and ordering inherent in a LineString (i.e., all the vertices that make up a line are grouped by some ID and sorted in a specific order).

A similar question was asked here, but I don't understand from the answers (1) how to meet my groupby/sortby requirement; and (2) why they use a one-line function. It seems like there should be a cleaner way.

Below I have an example where I build a LineString from a Points GeoDataFrame, and I am basically trying to decompose it back to Points. In reality, I don't have the original Points GeoDataFrame; I just made one up here so that someone can have an easy copy/paste example to work with (per the question guidelines).

Build Example LineString GeoDataFrame

%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
from shapely.geometry import LineString
import pyproj
from pyproj import CRS

myid = [1, 1, 1, 2, 2]
myorder = [1, 2, 3, 1, 2]
lat = [36.42, 36.4, 36.32, 36.28, 36.17]
long = [-118.11, -118.12, -118.07, -117.95, -117.95]
df = pd.DataFrame(list(zip(myid, myorder, lat, long)), columns =['myid', 'myorder', 'lat', 'long']) 
gdf_pt = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['long'], df['lat']))
display(gdf_pt)
gdf_line = gdf_pt.sort_values(by=['myorder']).groupby(['myid'])['geometry'].apply(lambda x: LineString(x.tolist()))
gdf_line = gpd.GeoDataFrame(gdf_line, geometry='geometry')
gdf_line.crs = "EPSG:4326"
display(gdf_line)
ax = gdf_line.plot();
ax.set_aspect('equal')
ax.tick_params(axis='x', labelrotation=90)  # rotate the x tick labels


Attempt
The code below follows one of the answers from the linked question. It returns a pandas Series of coordinate lists, and I'm not sure how to unpack it into a dataframe that is grouped by "myid" and has a column recording the vertex order.

mypoints = gdf_line.apply(lambda x: [y for y in x['geometry'].coords], axis=1)
print(mypoints)
print(type(mypoints))


System details:
Windows 10
conda 4.8.2
Python 3.8.3
shapely 1.7.0 py38hbf43935_3 conda-forge
pyproj 2.6.1.post1 py38h1dd9442_0 conda-forge

Best Answer

I am not sure I understood your question correctly.
Anyway, I think the problem is solved if you build a gdf that keeps the order and the ID.
The coords attribute of a shapely LineString returns the coordinates (point values) of that line.
Based on this, you can create a new gdf.
By default, coords returns the values in the vertex order of the LineString.
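
As a quick check of that ordering, here is a minimal sketch. The line is a stand-in built from the first three vertices of the question's example, and the names vertices and ordered_points are only illustrative.

```python
from shapely.geometry import LineString, Point

# Stand-in line built from the first three vertices of the question's example
line = LineString([(-118.11, 36.42), (-118.12, 36.40), (-118.07, 36.32)])

# coords iterates over the vertices in the order in which the line was built
vertices = list(line.coords)

# enumerate attaches a 1-based order to each vertex
ordered_points = [(order, Point(xy)) for order, xy in enumerate(vertices, start=1)]
```

Each entry of ordered_points pairs the position of a vertex along the line with a Point, which is the information the new gdf needs to carry.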

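myid_list = gdf_line.index.to_list()
# iterate over the geometry column itself: unary_union merges the lines into one geometry,
# whose parts are not guaranteed to line up with the index (and are not iterable in shapely 2.x)
repeat_list = [len(line.coords) for line in gdf_line['geometry']]  # how many points in each LineString
coords_list = [line.coords for line in gdf_line['geometry']]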

#make new gdf
gdf = gpd.GeoDataFrame(columns=['myid', 'order', 'geometry'])

for myid, repeat, coords in zip(myid_list, repeat_list, coords_list):
    index_num = gdf.shape[0]
    for i in range(repeat):
        gdf.loc[index_num+i, 'geometry'] = Point(coords[i])
        gdf.loc[index_num+i, 'myid'] = myid

gdf['order'] = range(1, 1+len(gdf))  # running order across all points (does not restart per line)

#you can use groupby method
gdf.groupby('myid')['geometry'].apply(list)

There are probably better ways to do this.

UPDATE

Following AlexS1's comment, set the order inside the loop so that it restarts at 1 for each myid:

for myid, repeat, coords in zip(myid_list, repeat_list, coords_list):
    index_num = gdf.shape[0]
    for i in range(repeat):
        gdf.loc[index_num+i, 'geometry'] = Point(coords[i])
        gdf.loc[index_num+i, 'myid'] = myid
        gdf.loc[index_num+i, 'order'] = i+1
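
A more compact variant of the same idea, offered as a sketch rather than a definitive implementation: collect one record per vertex in a list and build the GeoDataFrame in a single call, instead of growing it row by row with .loc. The helper name lines_to_points and its parameters are hypothetical, and the small gdf_line below is a stand-in for the one in the question. It assumes the index of gdf_line holds myid, as in the question.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point


def lines_to_points(gdf_line, id_name="myid", order_name="order"):
    """Return one row per vertex, keeping the line ID and the vertex order.

    Hypothetical helper: assumes the index of gdf_line holds the line ID
    and that every geometry is a LineString.
    """
    records = [
        {id_name: line_id, order_name: order, "geometry": Point(xy)}
        for line_id, line in gdf_line.geometry.items()
        for order, xy in enumerate(line.coords, start=1)  # order restarts per line
    ]
    return gpd.GeoDataFrame(records, geometry="geometry", crs=gdf_line.crs)


# Stand-in for the question's gdf_line: two lines indexed by myid
gdf_line = gpd.GeoDataFrame(
    {
        "geometry": [
            LineString([(-118.11, 36.42), (-118.12, 36.40), (-118.07, 36.32)]),
            LineString([(-117.95, 36.28), (-117.95, 36.17)]),
        ]
    },
    index=[1, 2],
    crs="EPSG:4326",
)

gdf_points = lines_to_points(gdf_line)
```

The result has one row per vertex with the columns myid, order and geometry, so gdf_points.sort_values(['myid', 'order']) followed by a groupby on myid gives back the vertices of each line in their original sequence.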