GeoPandas Encoding – Encoding Issue While Making GeoDataFrame from Shapefile Using URL

encoding geodataframe geopandas url

The dataset I want to use has some special characters such as ü, õ, ö, and ä, which cause encoding issues.

The example below runs fine and does not garble those characters.

gdf_local = gpd.read_file(r'my-local-machine-path\maakond_shp.zip', encoding = 'iso-8859-15')

However, I want to create multiple GeoDataFrames and can only share the script, so downloading all of those shapefiles to my machine first is not feasible. I therefore want to create the GeoDataFrames directly from the URL. The example below works but garbles the special characters.

url = 'https://geoportaal.maaamet.ee/docs/haldus_asustus/maakond_shp.zip?t=20220501011007'
gdf_url = gpd.read_file(url) # it works but messes up those special characters

Next, I tried the following:

gdf_url = gpd.read_file(url, encoding='iso-8859-15')  # raises an error

But it threw this error: TypeError: __init__() got multiple values for keyword argument 'encoding'

Apparently GeoPandas tries UTF-8 by default, which causes the issue in my case, but it does not let me specify a different encoding.

Is there any workaround or solution for this?

Broken attempts:

gdf['mycol'] = gdf.mycol.str.encode('iso-8859-15')  # garbled it even more; trying other encodings did not help
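
For reference, a minimal sketch of why `.str.encode` on its own cannot repair the text: it turns strings into bytes, whereas undoing a wrong decode takes two steps (encode with the codec that was wrongly assumed, then decode with the real one). The helper name `redecode`, the sample value, and the direction of the mix-up (UTF-8 bytes read as ISO-8859-15) are assumptions for illustration only; which codecs were actually confused for this dataset is not known from the question, and a decode that substituted replacement characters cannot be reversed this way.

```python
def redecode(text, assumed, actual):
    """Undo a wrong decode: recover the original bytes, then decode them properly."""
    return text.encode(assumed).decode(actual)


original = "Põlva maakond"  # hypothetical sample value

# Simulate the damage: UTF-8 bytes read as if they were ISO-8859-15.
garbled = original.encode("utf-8").decode("iso-8859-15")

# Encoding alone yields a bytes object, not repaired text.
as_bytes = garbled.encode("iso-8859-15")

# Encoding with the wrongly assumed codec and decoding with the real one restores the text.
repaired = redecode(garbled, assumed="iso-8859-15", actual="utf-8")

if __name__ == "__main__":
    print(type(as_bytes).__name__)
    print(repaired == original)
```

On a column this could be applied per value, for example with `gdf['mycol'].map(lambda s: redecode(s, 'iso-8859-15', 'utf-8'))`, but only if the codec pair really matches how the data was mis-read.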

I tried using requests to save the file first and then build the GeoDataFrame from it, but that did not work.

Finally, I tried the solution from "Reading raw data into geopandas", but it also garbled those characters.

import fiona
import geopandas as gpd
import requests

url = 'https://geoportaal.maaamet.ee/docs/haldus_asustus/maakond_shp.zip?t=20220501011007'
request = requests.get(url)
b = request.content  # already a bytes object

with fiona.BytesCollection(b) as f:
    crs = f.crs
    gdf = gpd.GeoDataFrame.from_features(f, crs=crs)
    print(gdf.head())

Best Answer

This seems to be a bug in Fiona, a dependency of GeoPandas. It was discussed on GitHub and should be fixed in newer releases (>= 1.8.21).
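
A quick way to see whether the installed Fiona is at least that release is sketched below. It assumes the package is registered under the distribution name `fiona`; the helper names `parse_version` and `fiona_has_fix` are made up for this example, and the version parsing is deliberately simple (it ignores pre-release suffixes).

```python
import re
from importlib import metadata

FIXED_IN = (1, 8, 21)  # release mentioned in the answer


def parse_version(text):
    """Turn '1.8.21' into (1, 8, 21); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def fiona_has_fix():
    """Return True or False, or None if fiona is not installed."""
    try:
        installed = metadata.version("fiona")
    except metadata.PackageNotFoundError:
        return None
    return parse_version(installed) >= FIXED_IN


if __name__ == "__main__":
    print(fiona_has_fix())
```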

So try updating the fiona package and see whether that resolves it.
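
If upgrading is not possible, one fallback might be to let the script download the archive to a temporary file and read that, since the question shows that reading a local zip with an explicit encoding works. This is only a sketch under that assumption: the question mentions a similar attempt failing without giving details, so it may not help in every setup. The function names are hypothetical, and geopandas is imported inside the reader so the download helper can be used without it.

```python
import os
import tempfile
import urllib.request


def save_to_temp_zip(data):
    """Write raw bytes to a temporary .zip file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".zip")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


def read_remote_shapefile(url, encoding):
    """Download a zipped shapefile, then read it the same way as a local file."""
    import geopandas as gpd  # imported lazily; only needed for the read step

    with urllib.request.urlopen(url) as response:
        path = save_to_temp_zip(response.read())
    try:
        # Same call as the local example in the question, pointed at the temp file.
        return gpd.read_file(path, encoding=encoding)
    finally:
        os.remove(path)  # clean up so nothing is left on the user's machine
```

With the URL from the question this would be called as `gdf = read_remote_shapefile(url, 'iso-8859-15')`, so the script stays shareable and nothing has to be downloaded by hand.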
