[GIS] Extract raster values within shapefile with pygeoprocessing or gdal

gdalpygeoprocessingpython

I would like to know how to get all the raster values within a polygon using gdal or pygeoprocessing, without reading the entire grid as an array.

pygeoprocessing and gdal can do zonal statistics but only the min, max, mean, stdev or count are available from such a function. Since zonal statistics need to access the values, would it be easy to extract values the same way ?

I found a very similar question here :
(Getting pixel value of GDAL raster under OGR point without NumPy?)
but only for a particular "point".

Best Answer

You can use rasterio to extract the raster values within a polygon as in GIS SE: GDAL python cut geotiff image with geojson file

I use here a one band raster file and GeoPandas for the shapefile ( instead of Fiona)

enter image description here

import rasterio
from rasterio.mask import mask
import geopandas as gpd
shapefile = gpd.read_file("extraction.shp")
# extract the geometry in GeoJSON format
geoms = shapefile.geometry.values # list of shapely geometries
geometry = geoms[0] # shapely geometry
# transform to GeJSON format
from shapely.geometry import mapping
geoms = [mapping(geoms[0])]
# extract the raster values values within the polygon 
with rasterio.open("raster.tif") as src:
     out_image, out_transform = mask(src, geoms, crop=True)

The out_image result is a Numpy masked array

# no data values of the original raster
no_data=src.nodata
print no_data
-9999.0
# extract the values of the masked array
data = out_image.data[0]
# extract the row, columns of the valid values
import numpy as np
row, col = np.where(data != no_data) 
elev = np.extract(data != no_data, data)

Now I use How to I get the coordinates of a cell in a geotif? or Python affine transforms to transform between the pixel and projected coordinates with out_transform as the affine transform for the subset data

 from rasterio import Affine # or from affine import Affine
 T1 = out_transform * Affine.translation(0.5, 0.5) # reference the pixel centre
 rc2xy = lambda r, c: (c, r) * T1  

Creation of a new resulting GeoDataFrame with the col, row and elevation values

d = gpd.GeoDataFrame({'col':col,'row':row,'elev':elev})
# coordinate transformation
d['x'] = d.apply(lambda row: rc2xy(row.row,row.col)[0], axis=1)
d['y'] = d.apply(lambda row: rc2xy(row.row,row.col)[1], axis=1)
# geometry
from shapely.geometry import Point
d['geometry'] =d.apply(lambda row: Point(row['x'], row['y']), axis=1)
# first 2 points
d.head(2)
     row  col   elev       x          y            geometry  
 0    1    2  201.7!  203590.58  89773.50  POINT (203590.58 89773.50)  
 1    1    3  200.17  203625.97  89773.50  POINT (203625.97 89773.50)

# save to a shapefile
d.to_file('result.shp', driver='ESRI Shapefile')

enter image description here