[GIS] How to convert shapefile to raster and mask using GDAL Python 3

gdalpython

I have a polygon and filelist of data I need to convert my polygon to raster and mask my filelist with this mask and campute the statistics in the area in polygon. I'm using GDAL Python.

I can do it using R like this :

maskraster <- rasterize(polygon, raster, mask = TRUE) 
#calcul mean of area in polygons 
Rasternew <- cellStats(maskraster,stat=mean)

How can I do the same using GDAL in Python ?

Best Answer

I have a script which allows me to clip a raster to the polygon of a shapefile, and then compute statistics on it (including, as in this case) the mean. This example includes the modules pillow (https://pypi.org/project/Pillow/) and shapefile (https://pypi.org/project/pyshp/) on Python as well.

from osgeo import gdal, osr
import shapefile
from PIL import Image, ImageDraw

def world_to_image(lat_value,lon_value,dataset):
    '''
    This function is required for converting from lat/lon values to the corresponding values in the image coordinates.
    '''
    metadata = dataset.GetGeoTransform()
    # metadata is an array with six values: ulx, xres, xskew, uly, yskew, yres

    # The source projection (lat, long coords)
    source = osr.SpatialReference()
    source.ImportFromEPSG(4326)

    # The target projection (projection of the image)
    target = osr.SpatialReference()
    target.ImportFromWkt(dataset.GetProjection())

    # Create the transform - this can be used repeatedly
    transform = osr.CoordinateTransformation(source,target)
    new_point = transform.TransformPoint(long_value,lat_value)

    x_pixel = int( round((new_point[0] - metadata[0]) / metadata[1],0)) #matrix[:,*value*]
    y_pixel = int( round((new_point[1] - metadata[3]) / metadata[5],0)) #matrix[*value*,:]
    return y_pixel,x_pixel

dataset = gdal.Open("path/to/file")
band = dataset.GetRasterBand("band number")
band = band.ReadAsArray()
band = band.astype(float) #usually ensures no weirdness later when doing maths steps

sf = shapefile.Reader("path/to/shapefile")
minX, minY, maxX, maxY = sf.bbox #Get corner coordinates of shapefile

ulX,ulY = world_to_image(maxY,minX,dataset) #Convert shapefile corners to image 
lrX,lrY = world_to_image(minY,maxX,dataset)

pxwidth = int(lrX-ulX) #number of pixels for width of image
pxheight = int(lrY-ulY) #number of pixels for height of image
clip = band[ulY:lrY,ulX:lrX] #clipping original image to shapefile bounds

pixels = []
for p in sf.shape(0).points:
    temp = world_to_image(p[1],p[0],dataset)
    pixels.append((temp[1],temp[0])) #convert the points of the polygon to image coordinates 

rasterpoly = Image.new('L',(band.shape),1) #set up new image in the same shape as clipped raster image
rasterize = ImageDraw.Draw(rasterpoly) 
rasterize.polygon(pixels,0) #cuts empty image to exact shape of shapefile
mask = np.array(rasterpoly) #converts it to numpy array

'''
function which cuts the original raster image to the shape of the shapefile, 
leaving nodata points elsewhere.
'''
clip = gdal_array.numpy.choose(mask, (band, np.nan)).astype(gdal_array.numpy.float)

Apologies for the rather lengthy code snippet - it is cut from a longer body of code but I have tried to cut out everything unnecessary. This code should produce a 2D numpy array which you can then perform your statistical functions on. Apologies if this is not exactly the sort of thing you were looking for - it was quite hard to know exactly what you wanted from your post (as someone not familiar with R and therfore your example).