[GIS] Geopandas Line Polygon Intersection

geopandasintersectionpythonshapely

I'm trying to find where multiple lines intersect a polygon for two different geodataframes:

from shapely.geometry import Polygon, LineString
import geopandas as gpd

polygon = Polygon([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])
line1 = LineString([(0.5, 0.5), (0.7, 0.7)])
line2 = LineString([(0.9, 0.9), (0.2, 0.6)])


poly_gdf = gpd.GeoDataFrame(geometry=[polygon])
line_gdf = gpd.GeoDataFrame(geometry=[line1, line2])

This is what the above geodataframes look like (one has a polygon and the other has two lines). It looks to me as if both lines intersect the polygon:

Polygon and Lines

However, the intersect output is very confusing:

print(line_gdf.intersects(poly_gdf))

0 True

1 False

print(line1.intersects(polygon))
print(line2.intersects(polygon))

True

True

Why does the geopandas intersect method give a different output to the standard shapely one?

I am using Python 3.5.3 and Geopandas 0.2.1 all on Anaconda.

Best Answer

When comparing geodataframes with geometry operations in Geopandas, the geometries are first matched by index. In the case where there is no matching index (because you only have a single polygon for instance) then the result will be False.

If it were to compare each object in the GeoSeries you would instead need to get back a full rectangular dataframe of boolean values, and this would likely be very inefficient.

If you do want to compare all geometries then you have two options. The first (and probably easiest) is to use the geopandas sjoin method:

gpd.sjoin(line_gdf, poly_gdf, op='intersects')

This returns a new GeoDataFrame with the geometries for each object on the left dataframe repeated for each geometry they intersect in the right, with the index of the object in the right, i.e.:

                        geometry  index_right
0  LINESTRING (0.5 0.5, 0.7 0.7)            0
1  LINESTRING (0.9 0.9, 0.2 0.6)            0

The second method is to us the pandas apply method on the GeoSeries to return the rectangular dataframe:

line_gdf.geometry.apply(lambda g: poly_gdf.intersects(g))

Which in turn returns (with increasing inefficiency as the dataframes grow):

index_right     0
index_left
0            True
1            True

In general, unless you needed the square matrix, my advice would be to stick to the sjoin method.