I am using GeoPandas 0.12.2 (Python) to read and reproject a shapefile. My code works on Windows 10 but does not produce the correct results on Linux. Specifically, on Linux the reprojected geometry consists entirely of 'inf' values (see output below). I used the same environment.yml file to build the same conda/mamba environment on both systems. The source shapefile contains a single polygon with CRS EPSG:4269.
To get the correct behavior on Linux I've tried (without success) a handful of interventions aimed at lower-level control using Shapely and PyProj rather than Geopandas; these interventions were informed by the discussions involving PyProj axis order and the 'always_xy' parameter (e.g., https://stackoverflow.com/questions/60480010/python-pyproj-transform-yielding-different-results-for-the-same-input-parameters).
But my prior question is: Should this reprojection command require different implementation or arguments on Linux?
Here is the code:
import os
import geopandas as gpd
import pyproj
PROJ4 = "+datum=WGS84 +lat_0=23 +lat_1=29.5 +lat_2=45.5 +lon_0=-96 +no_defs +proj=aea +units=m +x_0=0 +y_0=0"
crs_to = pyproj.CRS.from_proj4(PROJ4)
# For reproducibility: Generate a sample GeoDataframe from a subset of the
# coordinates of my actual polygon shapefile
coords = [[-74.051, 42.818], [-74.0496, 42.819], [-74.0495, 42.817], [-74.0495, 42.817], [-74.051, 42.818]]
geojson={"type":"FeatureCollection", "features":[{"type":"Feature", "properties":{"id":1},
"geometry":{"type":"MultiPolygon", "coordinates":[[coords]]}}]}
shp_from = gpd.GeoDataFrame.from_features(geojson, crs='EPSG:4269')
shp_to = shp_from.to_crs(crs_to)
print('\nos name: ', os.name)
print('geopandas version: ', gpd.__version__)
print('pyproj version: ', pyproj.__version__)
print('crs_from: ', shp_from.crs)
print('crs_to: ', crs_to)
print('\n\nSource geometry')
print(shp_from.geometry)
print('\n\nTarget geometry')
print(shp_to.geometry)
[UPDATE]: On Linux, os.environ['PROJ_DATA'] = /path/to/env/share/proj. On Windows, there is no PROJ_DATA in os.environ.
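The script above does not itself print the "Environment variables include" and PROJ_DATA lines that appear in the outputs below. A minimal sketch of how such lines could be produced, using only the standard library, is given here; the helper name proj_env_report is hypothetical, and matching on the "PROJ_" prefix is an assumption about which variables are of interest.

```python
import os


def proj_env_report(environ=os.environ):
    """Return report lines describing PROJ-related environment variables.

    `environ` is any mapping of variable names to values; it defaults to
    the current process environment.
    """
    # Assumption: every variable of interest starts with "PROJ_".
    proj_vars = [name for name in environ if name.startswith("PROJ_")]
    lines = [f"Environment variables include: {proj_vars}"]
    if "PROJ_DATA" in environ:
        lines.append(f"PROJ_DATA = {environ['PROJ_DATA']}")
    else:
        lines.append("No PROJ_DATA in environment")
    return lines


if __name__ == "__main__":
    for line in proj_env_report():
        print(line)
```

Running this on both machines makes it easy to compare which PROJ settings differ between the two environments.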
Here is the output on Windows:
os name: nt
geopandas version: 0.12.2
pyproj version: 3.4.1
crs_from: epsg:4269
crs_to: +proj=aea +datum=WGS84 +lat_0=23 +lat_1=29.5 +lat_2=45.5 +lon_0=-96 +no_defs +units=m +x_0=0 +y_0=0 +type=crs
Environment variables include: ['PROJ_CURL_CA_BUNDLE']
No PROJ_DATA in environment
Source geometry
0 MULTIPOLYGON (((-74.05100 42.81800, -74.04960 ...
Name: geometry, dtype: geometry
Target geometry
0 MULTIPOLYGON (((1768715.829 2407533.606, 17688...
Name: geometry, dtype: geometry
Here is the output on Linux:
os name: posix
geopandas version: 0.12.2
pyproj version: 3.4.1
crs_from: epsg:4269
crs_to: +proj=aea +datum=WGS84 +lat_0=23 +lat_1=29.5 +lat_2=45.5 +lon_0=-96 +no_defs +units=m +x_0=0 +y_0=0 +type=crs
Environment variables include: ['PROJ_DATA', 'PROJ_NETWORK', 'PROJ_CURL_CA_BUNDLE']
PROJ_DATA = /home/wzell/mambaforge/envs/hyriver/share/proj
Source geometry
0 MULTIPOLYGON (((-74.05100 42.81800, -74.04960 ...
Name: geometry, dtype: geometry
Target geometry
0 MULTIPOLYGON ((inf inf, inf inf, inf inf, inf inf, ...
Name: geometry, dtype: geometry
Best Answer
It's not that Windows and Linux behave differently; something is amiss in your Linux environment. I ran your script on Linux with no issues and got the correct output. It may be that proj can't reach the internet to access the transformation grids from your Linux device. Try
os.environ['PROJ_NETWORK'] = 'OFF'
as suggested in "First call to transform() fails with inf, all subsequent calls are OK - what could be the reason?"
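A minimal sketch of switching PROJ network access off is shown below. Note that PROJ_NETWORK appears among the Linux environment variables in the question but not among the Windows ones. The sketch assumes that setting the variable before pyproj or geopandas is imported is the conservative choice; the additional call to pyproj.network.set_network_enabled is a belt-and-braces step, not something the original suggestion requires. The import is guarded so the snippet still runs where pyproj is absent.

```python
import os

# Set the variable before importing pyproj/geopandas so that PROJ sees it
# when its context is first created (assumed to be the safest ordering).
os.environ["PROJ_NETWORK"] = "OFF"

try:
    import pyproj
    from pyproj.network import is_network_enabled, set_network_enabled

    # Explicitly disable network access for grid downloads as well.
    set_network_enabled(active=False)
    print("pyproj version:", pyproj.__version__)
    print("PROJ network enabled:", is_network_enabled())
except ImportError:
    # pyproj is not installed in this interpreter; only the variable is set.
    print("pyproj not available; PROJ_NETWORK set to", os.environ["PROJ_NETWORK"])
```

If the reprojection in the question returns finite coordinates after this change, that would point to network grid access as the cause rather than to an operating-system difference.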
Some other refs: