Python – Extract Polygon Name DataFrame if Geo-point is Inside Polygon

geojson geopandas pandas python shapely

I have two datasets: one with polygon names and polygons, and the other with location names and their latitude and longitude.

Data 1 (Geopandas Dataframe)

import geopandas as gpd
data_poly = gpd.read_file(path + "Data_community_file.geojson")

COMMUNITY NAME   POLYGON
New York         MULTIPOLYGON (((55.1993358199345 25.20971347951325, 
                 55.19385836251354 25.20134197109752.... 25.20971347951325)))
Chennai          MULTIPOLYGON (((65.1993358199345 22.20871347951325, 
                 55.19325836251354 15.20132197109752 .... 15.20971347951325))) 

Data 2 (Data Frame)

STOP NAME            LONGITUDE       LATITUDE
Chennai main stop    55.307228       25.248844
Cabra stop           55.278824       25.205862
USA stop NY          55.069368       24.973946

If a stop in Data 2 (STOP NAME) lies inside a polygon in Data 1, I need to extract the name of that polygon. For example, if "USA stop NY" falls inside the "New York" polygon, that name should be added to a new column in Data 2. (The lat and lon need to be converted to Point(Lat, Lon) format for the code below.)

Sample code :

import json
from shapely.geometry import shape, Point

# load GeoJSON file containing sectors
with open('sectors.json') as f:
    js = json.load(f)

# construct point based on lon/lat returned by geocoder
point = Point(-122.7924463, 45.4519896)

# check each polygon to see if it contains the point
for feature in js['features']:
    polygon = shape(feature['geometry'])
    if polygon.contains(point):
        print(feature)

The above code can find the polygon that contains a single Point. How can I apply the same check to every row of a data frame instead of one point?

Best Answer

There is a great operation for this: geopandas.sjoin(). With this method you can find out which geometries from one GeoDataFrame intersect, contain, or are within those of another. To use it, you must have two GeoDataFrames. In your case, you have to convert the lat/lon DataFrame into one, by doing something like this:

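import geopandas as gpd
from shapely.geometry import Point
df['geometry'] = df.apply(lambda x: Point(x['LONGITUDE'], x['LATITUDE']), axis=1)
df = gpd.GeoDataFrame(df, geometry='geometry', crs=data_poly.crs)  # sjoin needs a GeoDataFrame; assumes both share one CRS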

Then you can use sjoin. The result has the columns of df joined with the columns of data_poly, with one row for each point in df that lies within a polygon in data_poly.

import geopandas as gpd
# GeoPandas 0.10+ uses predicate=; older releases spelled this keyword op=
joined_gdf = gpd.sjoin(df, data_poly, predicate='within')
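
For a self-contained picture of the whole flow, here is a minimal sketch. The two rectangles and their names ("West block", "East block"), the extra "Far away stop" row, and the EPSG:4326 CRS are made up for illustration and are not from the question's data. It assumes GeoPandas 0.10 or newer (for predicate=), builds the points with gpd.points_from_xy instead of apply, and uses how='left' so that stops outside every polygon are kept with a missing community name rather than dropped.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Hypothetical toy polygons standing in for data_poly (not the real communities).
data_poly = gpd.GeoDataFrame(
    {"COMMUNITY NAME": ["West block", "East block"]},
    geometry=[
        Polygon([(55.0, 24.9), (55.2, 24.9), (55.2, 25.3), (55.0, 25.3)]),
        Polygon([(55.2, 24.9), (55.4, 24.9), (55.4, 25.3), (55.2, 25.3)]),
    ],
    crs="EPSG:4326",  # assumed CRS: plain lon/lat degrees
)

# Stops from the question, plus one invented stop that lies outside every polygon.
df = pd.DataFrame(
    {
        "STOP NAME": ["Chennai main stop", "Cabra stop", "USA stop NY", "Far away stop"],
        "LONGITUDE": [55.307228, 55.278824, 55.069368, 10.0],
        "LATITUDE": [25.248844, 25.205862, 24.973946, 10.0],
    }
)

# Build point geometries; note the (x, y) = (longitude, latitude) order.
stops = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["LONGITUDE"], df["LATITUDE"]),
    crs=data_poly.crs,
)

# Left spatial join: every stop is kept, and stops inside a polygon
# receive that polygon's columns.
joined = gpd.sjoin(stops, data_poly, how="left", predicate="within")
result = joined[["STOP NAME", "COMMUNITY NAME"]]
print(result)
```

If the polygons overlap, a stop inside more than one of them appears once per matching polygon, so the result may need de-duplicating before it is written back as a single new column.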

UPDATE:

To perform sjoin you must have libspatialindex and rtree installed. You can install them without sudo using:

$ pip install rtree
$ conda install -c conda-forge libspatialindex 