[GIS] Spatial join using python shapely and fiona

fionageopandaspythonshapefileshapely

I want to create a python script when count how many points(from point shapefile) is in the polygons(polygon shapefile) and I want that count to create a new polygon shapefiles with the polygons where have upper to one points within.

using fiona for inputs shapefiles

filepath1 = "polygon.shp"
file1features = []
outschema = None
with fiona.collection(filepath1, "r") as input1:
    outschema = input1.schema.copy()
    for p1 in input1:
        file1features.append(p1)
filepath2 = "point.shp"
file2features = []
outschema = None
with fiona.collection(filepath2, "r") as input1:
    outschema = input2.schema.copy()
    for p1 in input2:
        file2features.append(p1)

I have find this code from github for spatial join but how to connect that code woth my code for inputs shapefiles and how to crate a new polygon shapefile with upper count > 1 ?

import numpy as np
import pandas as pd
from shapely import prepared


def sjoin(left_df, right_df, how='inner', op='intersects',
          lsuffix='left', rsuffix='right'):
    """Spatial join of two GeoDataFrames.
    Parameters
    ----------
    left_df, right_df : GeoDataFrames
    how : string, default 'inner'
        The type of join:
        * 'left': use keys from left_df; retain only left_df geometry column
        * 'right': use keys from right_df; retain only right_df geometry column
        * 'inner': use intersection of keys from both dfs; retain only
          left_df geometry column
    op : string, default 'intersection'
        Binary predicate, one of {'intersects', 'contains', 'within'}.
        See http://toblerity.org/shapely/manual.html#binary-predicates.
    lsuffix : string, default 'left'
        Suffix to apply to overlapping column names (left GeoDataFrame).
    rsuffix : string, default 'right'
        Suffix to apply to overlapping column names (right GeoDataFrame).
    """
    import rtree

    allowed_hows = ['left', 'right', 'inner']
    if how not in allowed_hows:
        raise ValueError("`how` was \"%s\" but is expected to be in %s" % \
            (how, allowed_hows))

    allowed_ops = ['contains', 'within', 'intersects']
    if op not in allowed_ops:
        raise ValueError("`op` was \"%s\" but is expected to be in %s" % \
            (op, allowed_ops))

    if op == "within":
        # within implemented as the inverse of contains; swap names
        left_df, right_df = right_df, left_df

    if left_df.crs != right_df.crs:
        print('Warning: CRS does not match!')

    tree_idx = rtree.index.Index()
    right_df_bounds = right_df.geometry.apply(lambda x: x.bounds)
    for i in right_df_bounds.index:
        tree_idx.insert(i, right_df_bounds[i])

    idxmatch = (left_df.geometry.apply(lambda x: x.bounds)
                .apply(lambda x: list(tree_idx.intersection(x))))
    idxmatch = idxmatch[idxmatch.apply(len) > 0]

    if idxmatch.shape[0] > 0:
        # if output from  join has overlapping geometries
        r_idx = np.concatenate(idxmatch.values)
        l_idx = np.concatenate([[i] * len(v) for i, v in idxmatch.iteritems()])

        # Vectorize predicate operations
        def find_intersects(a1, a2):
            return a1.intersects(a2)

        def find_contains(a1, a2):
            return a1.contains(a2)

        predicate_d = {'intersects': find_intersects,
                       'contains': find_contains,
                       'within': find_contains}

        check_predicates = np.vectorize(predicate_d[op])

        result = (
                  pd.DataFrame(
                      np.column_stack(
                          [l_idx,
                           r_idx,
                           check_predicates(
                               left_df.geometry
                               .apply(lambda x: prepared.prep(x))[l_idx],
                               right_df[right_df.geometry.name][r_idx])
                           ]))
                   )

        result.columns = ['index_%s' % lsuffix, 'index_%s' % rsuffix, 'match_bool']
        result = (
                  pd.DataFrame(result[result['match_bool']==1])
                  .drop('match_bool', axis=1)
                  )

    else:
        # when output from the join has no overlapping geometries
        result = pd.DataFrame(columns=['index_%s' % lsuffix, 'index_%s' % rsuffix])

    if op == "within":
        # within implemented as the inverse of contains; swap names
        left_df, right_df = right_df, left_df
        result = result.rename(columns={
                    'index_%s' % (lsuffix): 'index_%s' % (rsuffix),
                    'index_%s' % (rsuffix): 'index_%s' % (lsuffix)})

    if how == 'inner':
        result = result.set_index('index_%s' % lsuffix)
        return (
                left_df
                .merge(result, left_index=True, right_index=True)
                .merge(right_df.drop(right_df.geometry.name, axis=1),
                    left_on='index_%s' % rsuffix, right_index=True,
                    suffixes=('_%s' % lsuffix, '_%s' % rsuffix))
                )
    elif how == 'left':
        result = result.set_index('index_%s' % lsuffix)
        return (
                left_df
                .merge(result, left_index=True, right_index=True, how='left')
                .merge(right_df.drop(right_df.geometry.name, axis=1),
                    how='left', left_on='index_%s' % rsuffix, right_index=True,
                    suffixes=('_%s' % lsuffix, '_%s' % rsuffix))
                )
    elif how == 'right':
        return (
                left_df
                .drop(left_df.geometry.name, axis=1)
                .merge(result.merge(right_df,
                    left_on='index_%s' % rsuffix, right_index=True,
                    how='right'), left_index=True,
                    right_on='index_%s' % lsuffix, how='right')
                .set_index('index_%s' % rsuffix)
)

I thing so like this for new shapefile ?

if intersectfeatures:
    outfile = "utputfile.shp"
    with fiona.collection(outfile, "w", "ESRI Shapefile", outschema) as output:
        for outfeature in intersectfeatures:
            output.write(outfeature)

But I can't to connection, any idea ?

Best Answer

For the first, you could to create a spatial join where you merge points to polygons, so you'd get a frequency of points per polygon as an attribute. Spatial joins are covered at geopandas here and are reviewed in a simple notebook here. The code snippet you include from their site is more complicated, and what's "under the hood" of their library. You could add geopandas as a dependency, and that will even reduce your fiona code.

For the second goal you state, do you really need a new shapefile with counts greater than 0, or is it a visualization task (where you select to visualize polygons with 1 or more points)? If you need a new shapefile, you'd select your polygons with frequency > 0, save as a GeoDataFrame, and then export as a shapefile.

Note: You'll need to also install rtree to get the spatial join to work properly in geopandas.