OGR – Filter Vector Layer by List of Feature IDs

gdalogrpython

I often use GDAL/OGR via their python bindings.

I know of the very useful ogr.Layer.SetAttributeFilter() command. However I am hoping to filter by feature ID's, rather than attributes. Also, I don't want to filter by just one value, but rather by a set of values. Ideally I could pass a list of feature ID's to the filter — is there some method that can allow me to do this (either through SetAttributeFilter or otherwise)?

It occurred to me that I could copy the feature ID's to their own attribute field, but it would be nice to avoid having to do this.

CLARIFICATION: I am hoping to apply filters for read-only processes which would involve sorting, selecting & exporting/rasterizing the layer geometries. I primarily use ESRI Shapefile format.

Best Answer

For an ESRI Shapefile:

This can be accomplished using the ogr.layer.SetAttributeFilter() method. Feature ID's in a shapefile are stored in the FID field. This field cannot be accessed via OGR's GetField method (it is not a 'proper' attribute field in that it is not stored in the .dbf file), however it is apparently accessible through SQL queries. I was able to do this as follows:

## Open shapefile geometry
driver = ogr.GetDriverByName('ESRI Shapefile')
datasource = driver.Open(shapefile_path, 0)
layer = datasource.GetLayer()

## Store ID's of desired features in a list
fid_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

## Pass ID's to a SQL query as a tuple, i.e. "(1, 2, 3, ...)"
layer.SetAttributeFilter("FID IN {}".format(tuple(fid_list)))

## The filter is now active & we can work with the layer as we like!

## For example, we can rasterize the layer & only the features
## that we specified in 'fid_list' will be included in the output

## Initialize output raster
dataset_out = gdal.GetDriverByName('GTiff').Create(
                   output_path, cols, rows, 1, gdal.GDT_Byte)
dataset_out.SetGeoTransform(gt)
dataset_out.SetProjection(srs)

## Rasterize filtered layer
gdal.RasterizeLayer(output_path, [1], layer,
    options=["ATTRIBUTE={}".format(field_name), "ALL_TOUCHED=TRUE"])

NOTE: the FID field of a shapefile can change under some circumstances. For example, if a feature is deleted, the feature ID's will be automatically renumbered so that the numbering is sequential and without gaps. So, this approach would probably not be advisable unless the file is on your local network & you do not intend to do any editing while the filter is in place (ideally with the shapefile opened in read-only mode).

Of course, you could just create a new static attribute field which could hold the FID values or some other unique identifier. The approach detailed above would still work in this case.

Related Question