ArcGIS – Converting Dataframe with WKT Column to Spatial Dataframe in Python API

arcgis-python-api python shapely well-known-text

I am querying a data set from an Oracle Spatial server that returns the geometry field as WKT, and I am having trouble converting it to a spatial dataframe with the ArcGIS API for Python. The API documentation shows no direct way to import WKT, and I can't find one in the example projects either. The GeoAccessor.from_df() function has a geometry_column parameter, but it seems to expect the geometry in the Esri format (although the docs don't mention this).

I'm using the API instead of GeoPandas because, once I'm done massaging the data, I need to publish it to ArcGIS Online, and the API seems to handle this pretty easily. The only similar use case I found on the Esri forums converts the WKT into a shapely object and then calls Geometry.from_shapely(), but for me this keeps raising NameError: name '_HASSHAPELY' is not defined. The only requirements are that the result is a spatial dataframe and that it doesn't require ArcPy. This seems too obvious, and I feel like I'm missing something.

This is what happened when I tried it directly:

import arcgis
from arcgis.geometry import Geometry
from arcgis.features import GeoAccessor, GeoSeriesAccessor
import pandas as pd
import shapely
from shapely import wkt

print(
    f'Arcgis: {arcgis.__version__}  \n'
    f'Shapely: {shapely.__version__}  \n'
    f'Pandas: {pd.__version__}'
)

>>> Arcgis: 2.0.0, Shapely: 1.7.1, Pandas: 1.3.4

# conn = connection string goes here
df = pd.read_sql('SELECT * FROM DB', con=conn)
df

>>>   objectid                                     shape_wkt
0     83081       POLYGON ((2650551.5177502 224562.792018803, 26...
1     25643       POLYGON ((2646974.1601797 228173.696717298, 26...
2     13084       POLYGON ((2483098.44031584 209641.4424441, 248...
3     83086       POLYGON ((2600620.00982339 205015.101437584, 2...
4     22395       POLYGON ((2589453.06485915 204928.628606869, 2...
..         ...                                                ...

sdf = GeoAccessor.from_df(df,
                          address_column=None,
                          geometry_column='shape_wkt')
sdf

>>> objectid shape_wkt
0   83081       {}
1   25643       {}
2   13084       {}
3   83086       {}
4   22395       {}
..         ...                                                ...

And this is what happened when I tried the longer, hacky way:

# conn = connection string goes here
df = pd.read_sql('SELECT * FROM DB', con=conn)
df

>>>   objectid                                     shape_wkt
0     83081       POLYGON ((2650551.5177502 224562.792018803, 26...
1     25643       POLYGON ((2646974.1601797 228173.696717298, 26...
2     13084       POLYGON ((2483098.44031584 209641.4424441, 248...
3     83086       POLYGON ((2600620.00982339 205015.101437584, 2...
4     22395       POLYGON ((2589453.06485915 204928.628606869, 2...
..         ...                                                ...

df['shapely_geom'] = df.shape_wkt.apply(wkt.loads)
df

>>> objectid                                     shape_wkt                                         shapely_geom
0   83081       POLYGON ((2650551.5177502 224562.792018803, 26...   POLYGON ((2650551.5177502 224562.792018803, 26...
1   25643       POLYGON ((2646974.1601797 228173.696717298, 26...   POLYGON ((2646974.1601797 228173.696717298, 26...
2   13084       POLYGON ((2483098.44031584 209641.4424441, 248...   POLYGON ((2483098.44031584 209641.4424441, 248...
3   83086       POLYGON ((2600620.00982339 205015.101437584, 2...   POLYGON ((2600620.00982339 205015.101437584, 2...
4   22395       POLYGON ((2589453.06485915 204928.628606869, 2...   POLYGON ((2589453.06485915 204928.628606869, 2...
... ... ... ..

type(df.shapely_geom[0])
>>> shapely.geometry.polygon.Polygon

df.shapely_geom[0]
>>> (the notebook renders the shapely polygon as an inline image)

df['esri_geom'] = df.shapely_geom.apply(Geometry.from_shapely)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-48-5c763be36efd> in <module>
----> 1 df['esri_geom'] = df.shapely_geom.apply(Geometry.from_shapely)

~\Miniconda3\envs\map_env\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwargs)
   4355         dtype: float64
   4356         """
-> 4357         return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
   4358 
   4359     def _reduce(

~\Miniconda3\envs\map_env\lib\site-packages\pandas\core\apply.py in apply(self)
   1041             return self.apply_str()
   1042 
-> 1043         return self.apply_standard()
   1044 
   1045     def agg(self):

~\Miniconda3\envs\map_env\lib\site-packages\pandas\core\apply.py in apply_standard(self)
   1099                     values,
   1100                     f,  # type: ignore[arg-type]
-> 1101                     convert=self.convert_dtype,
   1102                 )
   1103 

~\Miniconda3\envs\map_env\lib\site-packages\pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()

~\Miniconda3\envs\map_env\lib\site-packages\arcgis\geometry\_types.py in from_shapely(cls, shapely_geometry, spatial_reference)
    875 
    876         """
--> 877         if _HASSHAPELY:
    878             gj = shapely_geometry.__geo_interface__
    879             geom_cls = _geojson_type_to_esri_type(gj["type"])

NameError: name '_HASSHAPELY' is not defined

The shapely package is clearly imported, the geometry is reported as a shapely geometry, and the polygon even renders when I browse the rows in a notebook, so I have no idea why the _HASSHAPELY variable would be undefined.

Best Answer

Looks like this issue was reported to the arcgis-python repo and has been fixed but not yet released.

As a workaround you may consider rolling the ArcGIS Python package back to 1.9.1 rather than the current 2.0.0 release.

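You may also try setting arcgis.geometry._types._HASSHAPELY = True before calling Geometry.from_shapely(), so that the check shown in the traceback finds the name defined.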
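The flag suggestion can be sketched roughly as below, together with a fallback that avoids from_shapely() altogether. This is an untested-against-2.0.0 sketch, not the library's own fix. The helper names enable_shapely_flag and polygon_to_esri_json are hypothetical, the wkid is an assumption you have to supply for your own data, and the fallback only handles plain Polygon geometries. It relies on the mapping that shapely exposes as __geo_interface__ and on the Esri JSON convention that exterior rings run clockwise and holes counterclockwise.

```python
import importlib


def enable_shapely_flag(module_name="arcgis.geometry._types"):
    """Define the module-level flag that from_shapely() checks.

    Returns False when the module cannot be imported (e.g. arcgis is absent).
    """
    try:
        types_mod = importlib.import_module(module_name)
    except ImportError:
        return False
    types_mod._HASSHAPELY = True
    return True


def _signed_area(ring):
    """Shoelace formula: positive for counterclockwise rings, negative for clockwise."""
    return sum(x1 * y2 - x2 * y1
               for (x1, y1), (x2, y2) in zip(ring, ring[1:])) / 2.0


def polygon_to_esri_json(geo, wkid):
    """Convert a GeoJSON-like Polygon mapping into an Esri JSON polygon dict.

    `geo` is what shapely returns from `polygon.__geo_interface__`;
    `wkid` is the spatial reference ID of your data (an assumption here).
    """
    if geo["type"] != "Polygon":
        raise ValueError("this sketch only handles Polygon geometries")
    rings = []
    for index, ring in enumerate(geo["coordinates"]):
        # Keep only x and y; rings are expected to be closed (first == last).
        points = [(float(p[0]), float(p[1])) for p in ring]
        is_clockwise = _signed_area(points) < 0
        # Esri JSON: exterior ring (index 0) clockwise, holes counterclockwise.
        want_clockwise = index == 0
        if is_clockwise != want_clockwise:
            points.reverse()
        rings.append([list(p) for p in points])
    return {"rings": rings, "spatialReference": {"wkid": wkid}}
```

With arcgis and shapely installed, one would call enable_shapely_flag() once and then retry df.shapely_geom.apply(Geometry.from_shapely). If that still fails, something like df.shapely_geom.apply(lambda g: Geometry(polygon_to_esri_json(g.__geo_interface__, wkid))) should produce a column that GeoAccessor.from_df() can take as geometry_column, since Geometry accepts an Esri JSON dict.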
