Python – Incorrect Transformation Issue in GeoPandas

coordinate systemgeopandaspython

Using GeoPandas, I am loading a csv with log data from a DJI drone, which contains longitude and latitude in WGS84 (4326). For my project, I am working in EPSG:6559 (esri calls it "NAD 1983 (2011) StatePlane Oregon North FIPS 3601 (Intl Feet)").

No issues loading the data, but when I transform my drone log with GeoPandas from 4326 to 6559, the logs are shifted a few feet from their correct position. When I load the same logs into ArcGIS Pro, project them to 6559, export them (to gpkg), and load the pre-projected data with GeoPandas, there is no error.

I was experiencing a similar shift in ArcGIS. This was because I had given my data vertical datums, and ArcGIS couldn't find a transformation to go from one combination of CRS/vertical datum to the other combination. I resolved this by removing the vertical datums. However, in GeoPandas, I am not using vertical datums at all.

Am I doing something wrong? How do I get GeoPandas to transform my log data correctly? (Worst case scenario is that I can simply translate the data to correct it, but clearly this stuff should work correctly.)

(Edit: add what transformation details I could find.) ArcGIS says its datum transformation is: WGS_1984_(ITRF08)_To_NAD_1983_2011 and gives a wkid of 108363. pyproj says Inverse of NAD83(2011) to WGS 84 (1) and says EPSG:9774. So maybe the ITRF08 flavor of WGS84 is different than whatever flavor proj is using. I mean, I had figured the issue was something like that. What to do about it?

Relevant Code:

import pandas as pd
import geopandas as gp
import matplotlib.pyplot as plt

LOG = "data/logs/Oct-17th-2023-02-53PM-Flight-Airdata.csv"
WGS84 = "EPSG:4326"
OR_NORTH = "EPSG:6559"  # NAD83(2011) / Oregon North (intl ft)

# geopandas is making everything an object type instead of int, float, etc.
# not sure why, but opening the file in pandas first, then converting to geodataframe
# fixes it, and gives and chance to set geospatial stuff.
raw_data = pd.read_csv(LOG)
shifted_data = gp.GeoDataFrame(data=raw_data,
                     geometry=gp.points_from_xy(
                         x=raw_data["longitude"],
                         y=raw_data["latitude"]),
                     crs=WGS84)

print("shifted crs before transform", shifted_data.crs)
shifted_data = shifted_data.to_crs(crs=OR_NORTH)
print("shifted crs after transform", shifted_data.crs)

correct_data = gp.read_file("gis/plates.gpkg", layer="flights_2_53_6559")
print("correct crs", correct_data.crs)

parking_spaces = gp.read_file("gis/plates.gpkg", layer="spaces")
print("spaces crs", parking_spaces.crs)

# create plot
# red = bad
# green = good
# parking spaces 'baselayer' for reference
_, ax = plt.subplots(ncols=1, figsize=(10, 10), dpi=300)
parking_spaces.plot(ax=ax, column="type")
shifted_data.plot(ax=ax, color="#F00", markersize=1)
correct_data.plot(ax=ax, color="#0F0", markersize=1)
plt.show()

Output

shifted crs before transform EPSG:4326
shifted crs after transform EPSG:6559
correct crs EPSG:6559
spaces crs EPSG:6559

Plot: Green=good, Red=bad

Best Answer

Using EPSG:8999 or EPSG:7911 instead of EPSG:4326 makes the "shifted" data align with the "correct" data. That is really frustrating and annoying, both in ArcGIS and GeoPandas/pyproj. I can only guess that ArcGIS is automagically choosing to use a newer WGS datum or datum transformation, because when I imported my csv in ArcGIS, I definitely told it to use "WGS84" and it used 4326. And pyproj isn't (again, guessing) doing you any complementary service, just giving you A→B just as you ask it to.

Related Question