GeoPandas – Best Way to Check if a Point is in a Region Using GeoPandas

geopandaspandas

I'm working with Chicago PD crash data that includes Lat/Long. I've read the crashes data into the crashes DF.
It doesn't include the city ward number, so I'd like to add it. I figured out how to do this with a dictionary, where the ward number is the key and the ward boundary (geometry) is the value.

Using a function that takes a DF row as input and returns the ward number, I can add the ward number to each row in the DF.

import geopandas as gpd
import pandas as pd

wards = gpd.read_file('https://data.cityofchicago.org/api/geospatial/sp34-6z76?method=export&format=GeoJSON')
wards['ward'] = pd.to_numeric(wards['ward'])

wards_dd = {}
for index,row in wards.iterrows():
    wards_dd[row['ward']] = row['geometry']

def addWard(row):
    from shapely.geometry import Point 
    # iterate through the wards until we find the one continining the point and
    # return the ward number. 
    # Ward data is set up long/lat (x,y) format
    p = Point(row['longitude'],row['latitude'])
    for i in range(1,51):
        if wards_dd[i].contains(p):
            return i
    return 'NotFound'

crashes['ward'] = crashes.apply(addWard,axis = 1)

So this works, but it seems a bit kludgy. Is there a way to do this with Geopandas .sjoin function that would work better?

The Crashes DF is a standard (not geopandas) DF. The lat/long data is in separate columns. I could combine them into a shapely Point, but can a Point in a normal df be .sjoin-ed with a geopandas df to add the ward number to the Crashes DF?

PS I know there might be problems ignoring the curvature of the earth with this sort of question, but in this case, the areas are quite small (less than 1 mile across) so the curvature issue is negligible.

Best Answer

Spatial join

import geopandas as gpd
import pandas as pd

wards = gpd.read_file('https://data.cityofchicago.org/api/geospatial/sp34-6z76?method=export&format=GeoJSON')
wards['ward'] = pd.to_numeric(wards['ward'])

crashes = pd.read_csv(r'/home/bera/Desktop/crashes.csv') #Some random points
crashes['geometry'] = gpd.points_from_xy(crashes.xcoord, crashes.ycoord, crs="EPSG:4326")
crashes = gpd.GeoDataFrame(data=crashes, geometry='geometry')

crashes2 = gpd.sjoin(left_df=crashes, right_df=wards, how='left')
crashes2.to_file(r'/home/bera/Desktop/crashes_with_wards.gpkg')

enter image description here