Python Shapefile – Removing Polygons Based on Size Without Using ArcPy

polygonpythonshapefile

I am new to GIS. I want to remove some smaller polygons and only leave the polygons with a larger areas in Python scripts. Like this:

enter image description here

Here is the source files Shape files: buildings. I am thinking whether I can set a threshold for the size of the areas then remove them, but I don't know how to do this in Python.

Using the help by @Iridium , then I am able to compute the area of the polygons by this link Calculate area of polygons using OGR in python script

from osgeo import ogr

driver = ogr.GetDriverByName("ESRI Shapefile")
dataSource = driver.Open(shapefile_to_be_used 1)
layer = dataSource.GetLayer()
new_field = ogr.FieldDefn("Area", ogr.OFTReal)
new_field.SetWidth(32)
new_field.SetPrecision(2) #added line to set precision
layer.CreateField(new_field)

for feature in layer:
    geom = feature.GetGeometryRef()
    area = geom.GetArea() 
    print (area)
    feature.SetField("Area", area)
    layer.SetFeature(feature)

dataSource = None 

But how to filter out the polygons with small areas?

Best Answer

If you want a way to do these simple types of spatial operations in Python you can try geopandas which is a spatial extension to the (super popular and useful) pandas library.

As a quick example, firstly read the data in as a Geodataframe object, and reproject to a local projection (not being from South Africa, I hope this is right!).

import geopandas as gpd

buildings = gpd.read_file('buildings.shp').to_crs(epsg=22281)

At this point it's pretty easy to calculate the area for each geometry, and then threshold by some arbitrary number.

mask = buildings.area > 1000.  # metres squared

Finally, use the mask and fancy pandas indexing to mask out the points of interest:

selected_buildings = buildings.loc[mask]

At this point you can plot the data out:

ax = buildings.plot(color="gray", figsize=(20, 8))
selected_buildings.plot(ax=ax)

Subsetted buildings in blue

Or save the data back out to disk:

selected_buildings.to_file('filtered_buildings.shp')
Related Question