GeoPandas – How to Explode Column in Python

Tags: explode, geopandas, python

I have a shapefile with a column NOM_CONCAT, and I cannot explode it so that each row holds a single value.
Every time, I get the same error message, and I can't find a solution.

#TypeError: explode() takes 1 positional argument but 2 were given

SHAPEFILE

Code:

import pandas as pd
import numpy as np
import time
import datetime
import sys
import geopandas as gpd


gdf = gpd.read_file('RSF_Fiche_Canalisation-power_shp_ME_Position-04012021.shp')
# Split the comma-separated names into lists
gdf['NOM_CONCAT'] = gdf['NOM_CONCAT'].str.split(',  ')
# Replace missing values with empty lists
gdf.NOM_CONCAT = gdf.NOM_CONCAT.fillna({i: [] for i in gdf.index})
# This call raises the TypeError shown above
gdf = gdf.explode('NOM_CONCAT')
print("temp_complet_explode", gdf)

Current situation:

[Image: current situation]

Situation I hope:

[Image: desired situation]

Best Answer

Per the doc:

[Image: GeoDataFrame.explode documentation]
Source: https://geopandas.org/reference.html

As documented there, explode() is a method of the GeoDataFrame object and takes no argument besides the object itself. The GeoDataFrame is already counted as the first positional argument, so the column name you pass is counted as a second one, hence the error you get. Note that GeoDataFrame.explode() is intended to:

Explode multi-part geometries into multiple single geometries.
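
The error message itself can be reproduced without geopandas. The following is a minimal sketch using a hypothetical class (FakeFrame is not part of any library) whose explode() method accepts nothing besides self, like the method documented above:

```python
class FakeFrame:
    """Hypothetical stand-in for an object whose explode() takes no argument."""

    def explode(self):
        return "exploded"


frame = FakeFrame()

try:
    # 'frame' is passed implicitly as self, so 'NOM_CONCAT'
    # becomes a second positional argument
    frame.explode('NOM_CONCAT')
except TypeError as err:
    message = str(err)

print(message)
```

The printed message ends with "takes 1 positional argument but 2 were given", the same wording as in the question.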

In the following example every geometry is a MultiLineString, so .explode() splits each one into new rows, each holding one part of the original geometry:

import geopandas as gpd
from shapely import wkt

# Five identical two-part MultiLineStrings
multiline = 'MULTILINESTRING((3 4,10 50,20 25),(-5 -8,-10 -8,-15 -4))'
gdf = gpd.GeoDataFrame({
    'ID': [1,2,3,4,5],
    'identifiant': [11,12,13,14,15],
    'nom_concat': ['123abc',['123def','123ghj'],['123klm','123nop'],'123qrs','123tuv'],
    'geometry': [wkt.loads(multiline) for _ in range(5)]
})

which results in:

[Image: resulting GeoDataFrame]

Calling .explode() then actually explodes your geometries:

[Image: result after explode]

But if your geometries are not multi-part:

import geopandas as gpd
from shapely import wkt

# Single-part geometries this time (a plain LineString, chosen for illustration)
line = 'LINESTRING(3 4,10 50,20 25)'
gdf2 = gpd.GeoDataFrame({
    'ID': [1,2,3,4,5],
    'identifiant': [11,12,13,14,15],
    'nom_concat': ['123abc',['123def','123ghj'],['123klm','123nop'],'123qrs','123tuv'],
    'geometry': [wkt.loads(line) for _ in range(5)]
})

calling .explode() returns the same rows as your original dataframe, since there is nothing to split:

[Image: result of explode on simple geometries]

What you probably want is the pandas explode method instead, which expects a column name as its argument:

import geopandas as gpd
from shapely import wkt
import pandas as pd

# Single-part geometries (a plain LineString, chosen for illustration)
line = 'LINESTRING(3 4,10 50,20 25)'
gdf2 = gpd.GeoDataFrame({
    'ID': [1,2,3,4,5],
    'identifiant': [11,12,13,14,15],
    'nom_concat': ['123abc',['123def','123ghj'],['123klm','123nop'],'123qrs','123tuv'],
    'geometry': [wkt.loads(line) for _ in range(5)]
})

df = pd.DataFrame(gdf2) # convert to a plain pandas DataFrame instance
type(df)

df.explode('nom_concat') # call the pandas explode method on a column

[Image: pandas explode method applied]
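
For the column in the question, the split and explode steps could look like the sketch below. It uses pandas only and made-up sample values (the ID and NOM_CONCAT contents here are hypothetical, not taken from the shapefile). It also shows that pandas explode leaves scalar and missing values unchanged, so filling missing values with empty lists beforehand should not be necessary:

```python
import pandas as pd

# Hypothetical sample data standing in for the NOM_CONCAT column
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'NOM_CONCAT': ['123abc', '123def, 123ghj', None],
})

# Split each string into a list; missing values stay missing
df['NOM_CONCAT'] = df['NOM_CONCAT'].str.split(', ')

# One row per list element; the row with a missing value is kept as is
exploded = df.explode('NOM_CONCAT')
print(exploded)
```

The separator passed to str.split must match the one actually used in the data.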

You can finally convert it back to a GeoDataFrame:

exploded_gdf = gpd.GeoDataFrame(df.explode('nom_concat')) 

Beware that the index changes after the explosion: rows created from the same original row share its index label. You can reset it if needed, e.g. with .reset_index().
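
A small sketch of that index behaviour, again with pandas only and hypothetical values:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2],
    'nom_concat': [['123def', '123ghj'], '123qrs'],
})

exploded = df.explode('nom_concat')
print(exploded.index.tolist())    # [0, 0, 1]: the first label is repeated

# drop=True discards the old labels instead of keeping them as a column
renumbered = exploded.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1, 2]
```

Keeping the repeated labels can be useful if you later need to trace each row back to its original feature.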
