I am trying to compute summary statistics of points located inside a polygon. Each point in the point layer is associated with N attributes. My goal is to summarize (e.g. mean, min, ...) the attributes of the points located within each polygon and populate the attribute fields of the corresponding polygon.
I am looking for a solution using GeoPandas or other Python libraries.
import geopandas as gpd
gdf_points = gpd.read_file('/path_to_points.json')
gdf_polygon = gpd.read_file('/path_to_polygons.json')
dfsjoin = gpd.sjoin(gdf_polygon, gdf_points)
Now, how can I summarize the stats for each attribute in the point layer and add it to the polygon shapefile? Which function can I use?
I am looking for something functionally equivalent to ESRI ArcGIS SpatialJoin_analysis with fieldmappings.
Best Answer
First, determine which points are contained in which polygons.
Use a spatial join (see, for example, "More Efficient Spatial join in Python without QGIS, ArcGIS, PostGIS, etc").
The points with id 1, 2 and 3 are contained in polygon 1 (id_right), and so on.
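A minimal sketch of this step, using made-up data rather than the layers from the question. The two square polygons, the five points and the `value1` attribute are assumptions for illustration. Because both layers carry an `id` column, `sjoin` suffixes them as `id_left` (point) and `id_right` (polygon).

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical polygon layer: two squares with ids 1 and 2
polys = gpd.GeoDataFrame(
    {"id": [1, 2]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(3, 0), (5, 0), (5, 2), (3, 2)]),
    ],
)

# Hypothetical point layer: five points with one numeric attribute
points = gpd.GeoDataFrame(
    {"id": [1, 2, 3, 4, 5], "value1": [10.0, 20.0, 30.0, 40.0, 50.0]},
    geometry=[
        Point(0.5, 0.5), Point(1, 1), Point(1.5, 0.5),
        Point(3.5, 1), Point(4, 1.5),
    ],
)

# Inner join: keep only points that fall within some polygon.
# `predicate` is the keyword in GeoPandas 0.10 and later; older releases call it `op`.
points_polys = gpd.sjoin(points, polys, how="inner", predicate="within")

print(points_polys[["id_left", "id_right", "value1"]])
```

Putting the points on the left keeps one row per point, which is the shape the grouping steps need.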
Check the number of points contained in each polygon.
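One way to do this check is a group size per polygon id. The sketch below uses a plain DataFrame as a hypothetical stand-in for the spatial-join result (geometry omitted, values made up), so the column names `id_left`, `id_right` and `value1` are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the spatial-join result (geometry column omitted)
points_polys = pd.DataFrame(
    {
        "id_left": [1, 2, 3, 4, 5],    # point ids
        "id_right": [1, 1, 1, 2, 2],   # id of the polygon containing each point
        "value1": [10.0, 20.0, 30.0, 40.0, 50.0],
    }
)

# Number of points per polygon
counts = points_polys.groupby("id_right").size()
print(counts)
```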
To summarize the stats for each attribute in the point layer and add them to the polygon layer, group points_polys by the id_right column (= polygons) and compute the mean, standard deviation, max and min of the attributes of each group of points (see "Naming returned columns in Pandas aggregate function").
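A sketch of the grouping step, again on a hypothetical stand-in for the join result with two made-up attributes, `value1` and `value2`. Passing a list of functions to `agg` gives two-level column labels, which are flattened here into names such as `value1_mean`.

```python
import pandas as pd

# Hypothetical stand-in for the spatial-join result (geometry column omitted)
points_polys = pd.DataFrame(
    {
        "id_right": [1, 1, 1, 2, 2],
        "value1": [10.0, 20.0, 30.0, 40.0, 50.0],
        "value2": [1.0, 2.0, 3.0, 4.0, 6.0],
    }
)

# One row per polygon, one column per (attribute, statistic) pair
stats = points_polys.groupby("id_right")[["value1", "value2"]].agg(
    ["mean", "std", "max", "min"]
)

# Flatten the two-level column labels: ("value1", "mean") -> "value1_mean"
stats.columns = ["_".join(col) for col in stats.columns]
print(stats)
```

Note that `std` is the sample standard deviation, so a polygon containing a single point gets NaN in that column.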
It is also possible to use named aggregations, available since pandas 0.25 (see "Pandas in 2019 - let's see what's new!").
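With named aggregation the output column names are chosen directly, so no flattening is needed. A minimal sketch on the same kind of hypothetical stand-in data (the column names are assumptions):

```python
import pandas as pd

# Hypothetical stand-in for the spatial-join result (geometry column omitted)
points_polys = pd.DataFrame(
    {
        "id_right": [1, 1, 1, 2, 2],
        "value1": [10.0, 20.0, 30.0, 40.0, 50.0],
    }
)

# Each keyword is the output column name; the tuple is (input column, function)
stats = points_polys.groupby("id_right").agg(
    value1_mean=("value1", "mean"),
    value1_std=("value1", "std"),
    value1_max=("value1", "max"),
    value1_min=("value1", "min"),
)
print(stats)
```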
Finally, join this DataFrame to the polygon GeoDataFrame and save the resulting layer.
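A sketch of the last step, assuming the statistics table is indexed by polygon id as produced by the grouping. The polygons, the statistics values and the output file name are hypothetical. The example writes a GeoPackage because the shapefile format truncates field names to 10 characters, which would cut a name such as `value1_mean`.

```python
import os
import tempfile

import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Hypothetical polygon layer with an "id" column
polys = gpd.GeoDataFrame(
    {"id": [1, 2]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(3, 0), (5, 0), (5, 2), (3, 2)]),
    ],
)

# Hypothetical per-polygon statistics, indexed by polygon id (id_right)
stats = pd.DataFrame(
    {"value1_mean": [20.0, 45.0], "value1_std": [10.0, 7.07]},
    index=pd.Index([1, 2], name="id_right"),
)

# A left join keeps every polygon, including those that contain no points
polys_stats = polys.merge(stats, left_on="id", right_index=True, how="left")

# Write to a temporary directory; replace with a real output path as needed
out_path = os.path.join(tempfile.mkdtemp(), "polys_stats.gpkg")
polys_stats.to_file(out_path, driver="GPKG")
```

Polygons without any points end up with NaN in the statistics columns, which can be filled or left as missing depending on how the layer will be used.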
With value1_std as label: