Python GeoPandas – Converting List of Data Points with Variable CRS to WGS 84

Tags: coordinate system, geopandas, pandas, python

I have a file, metadata.csv, listing N sensors and their locations:

sensor_id, pos_east, pos_north, crs
sensor1, 109435.44,  1157776.64, 5110
sensor2, 254194,     6593776,    25833
sensor3, 109444.23,  1157734.32, 5110
sensorN, .....,      .....,      ....

I need to create new columns containing the latitude and longitude for each of the sensors in WGS 84.

So far I'm able to do the conversion as long as I assume the whole file has the same CRS, which obviously isn't the case.

My function so far:

import geopandas as gpd
import pandas as pd

def convert_meta_crs(metadata_file):
    """Converts the Coordinate Reference System (CRS) in the input metadata file from
    whatever it is given in, to WGS84 (EPSG:4326)"""

    df = pd.read_csv(metadata_file)
    print(df["crs"])

    # Every row is treated as EPSG:5110 here, which is wrong for a mixed-CRS file
    geometry = gpd.points_from_xy(df["pos_east"], df["pos_north"], crs=5110)
    gdf = gpd.GeoDataFrame(df, geometry=geometry)
    gdf = gdf.to_crs(epsg=4326)
    # metadata_file_4326 (the output path) is assumed to be defined elsewhere in the script
    gdf.to_csv(metadata_file_4326, index=False)

I've attempted setting:
gdf["geometry"] = gpd.points_from_xy(gdf["pos_east"], gdf["pos_north"], crs=gdf["crs"])

But this yields the following error:

Traceback (most recent call last):
File "/code/./main.py", line 95, in <module>
  convert_meta_crs(metadata_file)
File "/code/./main.py", line 54, in convert_meta_crs 
  gdf["geometry"] =   gpd.points_from_xy(gdf["pos_east"], gdf["pos_north"], crs=gdf["crs"])
File "/usr/local/lib/python3.10/site-packages/geopandas/array.py", line 261, in points_from_xy
  return GeometryArray(vectorized.points_from_xy(x, y, z), crs=crs)
File "/usr/local/lib/python3.10/site-packages/geopandas/array.py", line 289, in __init__
  self.crs = crs
File "/usr/local/lib/python3.10/site-packages/geopandas/array.py", line 339, in crs
  self._crs = None if not value else CRS.from_user_input(value)
File "/usr/local/lib/python3.10/site-packages/pandas/core/generic.py", line 1527, in __nonzero__
  raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
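
The last frames of the traceback point at the cause: the `crs` argument of `points_from_xy` applies to the whole geometry array, and geopandas evaluates `not value` on it (array.py, line 339 above). For a pandas Series that check is ambiguous, so pandas raises. A minimal pandas-only sketch of the same failure, using a made-up `crs_column` Series in place of the real file:

```python
import pandas as pd

# Hypothetical stand-in for the "crs" column of metadata.csv
crs_column = pd.Series([5110, 25833, 5110], name="crs")

message = None
try:
    # The same kind of truthiness check the geopandas crs setter performs
    not crs_column
except ValueError as err:
    message = str(err)
    print(message)

# A single scalar code has an unambiguous truth value
single_code = int(crs_column.iloc[0])
scalar_is_falsy = not single_code

# The distinct codes are what a per-CRS split would have to iterate over
distinct_codes = sorted(int(code) for code in crs_column.unique())
print(distinct_codes)
```

In other words, each geometry array carries exactly one CRS, so rows with different codes have to be separated by code before the points are built.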

Any tips on how I can do this without splitting the file into N dataframes, running the CRS conversion on each one individually, and then merging them back into one dataframe?

I've not been able to find anything in the docs or online.

Python version: 3.10
Geopandas version: 0.12.2

Update:
BERA's answer worked, and I ended up with the following:

def convert_meta_crs(metadata_file):
  """Converts the Coordinate Reference System (CRS) for each of the 
  elements in the input metadata file from whatever it is given 
  in, to WGS84 (EPSG:4326)"""

  df = pd.read_csv(metadata_file)

  frames = []
  for crs, subframe in df.groupby("crs"):
    # Build the points in the group's own CRS, then reproject once to WGS 84
    geometry = gpd.points_from_xy(x=subframe["pos_east"], y=subframe["pos_north"], crs=crs)
    gdf = gpd.GeoDataFrame(data=subframe.copy(), geometry=geometry)
    frames.append(gdf.to_crs(4326))

  # Every frame is now in EPSG:4326, so they can be concatenated directly
  gdf = pd.concat(frames)
  # metadata_file_4326 (the output path) is assumed to be defined elsewhere in the script
  gdf.to_csv(metadata_file_4326, index=False)

Best Answer

You can use groupby to split the rows by crs, create a geodataframe for each group with that crs, and calculate lat/long. Then concatenate the dataframes into one, or do whatever else you need with them.

import geopandas as gpd
import pandas as pd

csv = r"/home/bera/Desktop/GIStest/meta.csv"

df = pd.read_csv(csv)

frames = []  # A list to hold each crs's dataframe

for crs, subframe in df.groupby("crs"):  # Group by each crs
    print(crs)
    # Create the points in this group's crs, and a geodataframe around them
    geometry = gpd.points_from_xy(x=subframe["pos_east"], y=subframe["pos_north"], crs=crs)
    gdf = gpd.GeoDataFrame(data=subframe.copy(), geometry=geometry)  # copy() avoids SettingWithCopyWarning
    wgs84 = gdf.to_crs(4326).geometry  # Reproject once, then read lat and long
    gdf["lat"] = wgs84.y
    gdf["lon"] = wgs84.x
    frames.append(gdf)  # Append the gdf to the frames list

# for frame in frames:
#     Do something

# Or concat them into one.
# You can't concat frames with different crs. Either drop the geometries or reproject them to the same crs
df2 = pd.concat([frame.drop(columns="geometry") for frame in frames])

# sensor_id   pos_east   pos_north    crs        lat        lon
# sensor1  109435.44  1157776.64   5110  59.416328  10.666176
# sensor3  109444.23  1157734.32   5110  59.415948  10.666329
# sensor2  254194.00  6593776.00  25833  59.410571  10.667913
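
If only the lat/lon columns are needed, a variation on the same groupby idea is to call pyproj (the library geopandas uses for reprojection) directly and write the results back by index, which keeps the rows in their original order. This is a rough sketch rather than part of the answer above: the function name `add_wgs84_columns` and the inline `CSV_TEXT` sample are made up for illustration, and `skipinitialspace=True` is only there in case the file really has padding after the commas, as it is displayed in the question.

```python
import io

import pandas as pd
from pyproj import Transformer

# Hypothetical inline copy of the three sample rows from the question
CSV_TEXT = """sensor_id, pos_east, pos_north, crs
sensor1, 109435.44, 1157776.64, 5110
sensor2, 254194, 6593776, 25833
sensor3, 109444.23, 1157734.32, 5110
"""


def add_wgs84_columns(df):
    """Return a copy of df with lat/lon columns in WGS 84 (EPSG:4326)."""
    out = df.copy()
    out["lat"] = float("nan")
    out["lon"] = float("nan")

    # One transformer per distinct EPSG code, applied to that group's rows
    for crs, rows in df.groupby("crs"):
        # always_xy=True: input is (east, north), output is (lon, lat)
        transformer = Transformer.from_crs(int(crs), 4326, always_xy=True)
        lon, lat = transformer.transform(
            rows["pos_east"].to_numpy(), rows["pos_north"].to_numpy()
        )
        # Write back by index label so the original row order is untouched
        out.loc[rows.index, "lat"] = lat
        out.loc[rows.index, "lon"] = lon
    return out


df = pd.read_csv(io.StringIO(CSV_TEXT), skipinitialspace=True)
result = add_wgs84_columns(df)
print(result)
```

Because the values are assigned through `rows.index`, no concat step is needed and there are no geometries with mixed crs to reconcile; the trade-off is that the result is a plain dataframe rather than a geodataframe.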