[GIS] Converting Pandas DataFrame to GeoDataFrame

csvgeodataframegeopandaspandas

This seems like a simple enough question, but I can't figure out how to convert a Pandas DataFrame to a GeoDataFrame for a spatial join?

Here is an example of what my data looks like using df.head():

    Date/Time           Lat       Lon       ID
0   4/1/2014 0:11:00    40.7690   -73.9549  140
1   4/1/2014 0:17:00    40.7267   -74.0345  NaN

In fact, this dataframe was created from a CSV so if it's easier to read the CSV in directly as a GeoDataFrame that's fine too.

Best Answer

Convert the DataFrame's content (e.g. Lat and Lon columns) into appropriate Shapely geometries first and then use them together with the original DataFrame to create a GeoDataFrame.

from geopandas import GeoDataFrame
from shapely.geometry import Point

geometry = [Point(xy) for xy in zip(df.Lon, df.Lat)]
df = df.drop(['Lon', 'Lat'], axis=1)
gdf = GeoDataFrame(df, crs="EPSG:4326", geometry=geometry)

Result:

    Date/Time           ID      geometry
0   4/1/2014 0:11:00    140     POINT (-73.95489999999999 40.769)
1   4/1/2014 0:17:00    NaN     POINT (-74.03449999999999 40.7267)

Since the geometries often come in the WKT format, I thought I'd include an example for that case as well:

import geopandas as gpd
import shapely.wkt

geometry = df['wktcolumn'].map(shapely.wkt.loads)
df = df.drop('wktcolumn', axis=1)
gdf = gpd.GeoDataFrame(df, crs="EPSG:4326", geometry=geometry)