[GIS] Why does spatial join on geodataframes return empty result

geodataframegeopandasintersectpythonspatial-join

I have two geodataframes that I want to perform a spatial join on; one is points, one is polygons. They are both dtype: int64. They both have .crs = {'init': 'epsg:4326'}. I have verified both of them in ArcMap that the points are indeed within the polys. I even performed an intersect between these points and polygons in ArcMap and it produced the correct new feature class (I have reasons for wanting to do this without arpy/manually in ArcMap). **Note: the points are from a shapefile and the polys are from a feature class
enter image description here

However, when I turn my centroid points and parcel polygons into geodataframes and use .sjoin(), it returns an empty geodataframe. I want the output to be a point-based geometry so that I can eventually turn it back into a point shapefile. I have tried:

intersect_test = gpd.sjoin(cents, polys, how='left', op='intersects')

and

intersect_test = gpd.sjoin(polys, cents, how='inner', op='intersects')

and

intersect_test = gpd.sjoin(polys, cents, how='right', op='intersects')

and

intersect_test = gpd.sjoin(cents, polys, how='left', op='intersects')

and pretty much every other configuration I can think of and it either returns the data for one of them, or returns a completely empty geodataframe result.

 intersect_test = gpd.sjoin(cents, polys, how='inner', op='intersects')
 intersect_test.count()
Out[126]: 
BLD_UNITS       0
LAND_USE_T      0
PARCEL_ID       0
PROP_IND_T      0
STORY_NBR       0
geometry        0
index_right     0
GEOID           0
CensusPop       0
CBArea          0
ST_FIPS         0
Shape_Length    0
Shape_Area      0
CO_FIPS         0
HU_Pop          0
Sq_Ft           0
dtype: int64

How can I resolve this without having to manually perform the intersect in ArcMap?

EDIT
Here's how I made my geodataframes. As I mentioned; one is a shapefile and one is a .gdb feature class. Question: Could this be due to crs-related problem?

#create poly gdf
cb_gdb = r"C:\Projects\Pop_Alloc\CensusBlocksStates.gdb"
cb_feat = "CBs_{}".format(state)
cents = gpd.read_file(cb_gdb, layer=cb_feat)
cents = {'init': 'epsg:4326'}

#create point gdf. The parcel centroids are first created from a polygon 
gdf. The new gdf is written to a shapefile to be tested against the census 
block polygons (to make sure they do in fact fall within the boundaries of 
the census blocks (cbs)) and the new centroid gdf(its a 
`type='geopandas.geodataframe.GeoDataFrame'`) is then used in the `.sjoin`       
cents = parcel_res_df
cents['geometry'] = 
cents['geometry'].representative_point()
cents_out_feat = r"C:\Projects\Pop_Alloc\{}_Res_centroids.shp".format(state)
cents = {'init': 'epsg:4326'}
cents.to_file(cents_out_feat)

Best Answer

This problem is a coordinate reference system issue. It is most likely caused by my census block feature class(aka polys) being in ESRI:102008. When I converted my polys feature class to a geodataframe, it showed {} when I ran polys.crs. Further investigation showed that the geometry looked like : (POLYGON ((1385968.793600001 83806.20800000057..., which was also incorrect. The solution required first finding the original crs, assigning the 'polys.crs' to the correct crs and then reprojecting to the desired crs. To do this, I used ogr/gdal:

import ogr
def get_proj(gdb_input, fc ):
    driver = ogr.GetDriverByName("OpenFileGDB")
    gdb = driver.Open(gdb_input, 0)
    print('There are {} total layers in {}'.format(gdb.GetLayerCount(), gdb_input))

    for i in range(gdb.GetLayerCount()):
        layer = gdb.GetLayerByIndex(i)
        if layer.GetDescription() == fc:
            break
    in_crs = layer.GetSpatialRef().ExportToProj4()
    return in_crs
    print('CRS = ', in_crs)

polys.crs = get_proj(poly_gdb, poly_name)
polys = cb_df.to_crs(cents.crs)
intersect = gpd.sjoin(ccents, polys, how='inner', op='intersects')

This returned the correctly intersected geodataframe. I have to say that I am surprised there is not documentation for what seems like a very common problem in using Geopandas with data from an ESRI context. Hope this helps.