[GIS] How to convert shapefile to raster and mask using GDAL Python 3

gdalpython

I have a polygon and filelist of data I need to convert my polygon to raster and mask my filelist with this mask and campute the statistics in the area in polygon. I'm using GDAL Python.

I can do it using R like this :

maskraster <- rasterize(polygon, raster, mask = TRUE) 
#calcul mean of area in polygons 
Rasternew <- cellStats(maskraster,stat=mean)

How can I do the same using GDAL in Python ?

Best Answer

I have a script which allows me to clip a raster to the polygon of a shapefile, and then compute statistics on it (including, as in this case) the mean. This example includes the modules pillow (https://pypi.org/project/Pillow/) and shapefile (https://pypi.org/project/pyshp/) on Python as well.

from osgeo import gdal, osr
import shapefile
from PIL import Image, ImageDraw

def world_to_image(lat_value,lon_value,dataset):
    '''
    This function is required for converting from lat/lon values to the corresponding values in the image coordinates.
    '''
    metadata = dataset.GetGeoTransform()
    # metadata is an array with six values: ulx, xres, xskew, uly, yskew, yres

    # The source projection (lat, long coords)
    source = osr.SpatialReference()
    source.ImportFromEPSG(4326)

    # The target projection (projection of the image)
    target = osr.SpatialReference()
    target.ImportFromWkt(dataset.GetProjection())

    # Create the transform - this can be used repeatedly
    transform = osr.CoordinateTransformation(source,target)
    new_point = transform.TransformPoint(long_value,lat_value)

    x_pixel = int( round((new_point[0] - metadata[0]) / metadata[1],0)) #matrix[:,*value*]
    y_pixel = int( round((new_point[1] - metadata[3]) / metadata[5],0)) #matrix[*value*,:]
    return y_pixel,x_pixel

dataset = gdal.Open("path/to/file")
band = dataset.GetRasterBand("band number")
band = band.ReadAsArray()
band = band.astype(float) #usually ensures no weirdness later when doing maths steps

sf = shapefile.Reader("path/to/shapefile")
minX, minY, maxX, maxY = sf.bbox #Get corner coordinates of shapefile

ulX,ulY = world_to_image(maxY,minX,dataset) #Convert shapefile corners to image 
lrX,lrY = world_to_image(minY,maxX,dataset)

pxwidth = int(lrX-ulX) #number of pixels for width of image
pxheight = int(lrY-ulY) #number of pixels for height of image
clip = band[ulY:lrY,ulX:lrX] #clipping original image to shapefile bounds

pixels = []
for p in sf.shape(0).points:
    temp = world_to_image(p[1],p[0],dataset)
    pixels.append((temp[1],temp[0])) #convert the points of the polygon to image coordinates 

rasterpoly = Image.new('L',(band.shape),1) #set up new image in the same shape as clipped raster image
rasterize = ImageDraw.Draw(rasterpoly) 
rasterize.polygon(pixels,0) #cuts empty image to exact shape of shapefile
mask = np.array(rasterpoly) #converts it to numpy array

'''
function which cuts the original raster image to the shape of the shapefile, 
leaving nodata points elsewhere.
'''
clip = gdal_array.numpy.choose(mask, (band, np.nan)).astype(gdal_array.numpy.float)

Apologies for the rather lengthy code snippet - it is cut from a longer body of code but I have tried to cut out everything unnecessary. This code should produce a 2D numpy array which you can then perform your statistical functions on. Apologies if this is not exactly the sort of thing you were looking for - it was quite hard to know exactly what you wanted from your post (as someone not familiar with R and therfore your example).

Related Solutions

[GIS] Turning shapefile into mask and calculating mean

You are looking for the gdal.RasterizeLayer function.

You could then use ReadAsArray to turn the rasterized polygon into a numpy array.

Based on your NetCDF file extent and rows/columns, the following code should generate you a numpy 0-1 mask that matches the NetCDF exactly.

shapefile=r'whatever your shapefile path is'
xmin,ymin,xmax,ymax=[139.8,-39.2,150.0,-33.6] #Your extents as given above
ncols,nrows=[193,106] #Your rows/cols as given above
maskvalue = 1

xres=(xmax-xmin)/float(ncols)
yres=(ymax-ymin)/float(nrows)
geotransform=(xmin,xres,0,ymax,0, -yres)

src_ds = ogr.Open(shapefile)
src_lyr=src_ds.GetLayer()

dst_ds = gdal.GetDriverByName('MEM').Create('', ncols, nrows, 1 ,gdal.GDT_Byte)
dst_rb = dst_ds.GetRasterBand(1)
dst_rb.Fill(0) #initialise raster with zeros
dst_rb.SetNoDataValue(0)
dst_ds.SetGeoTransform(geotransform)

err = gdal.RasterizeLayer(dst_ds, [1], src_lyr, burn_values=[maskvalue])

dst_ds.FlushCache()

mask_arr=dst_ds.GetRasterBand(1).ReadAsArray()

[GIS] how to overlay shapefile and raster

In R you can do

library(raster)
library(rgdal)
r <- raster('raster_filename')
p <- readOGR('shp_path', 'shp_file')
e <- extract(r, p, fun=mean)

e is a vector with the mean of the raster cell values for each polygon.

Best Answer

Related Solutions

[GIS] Turning shapefile into mask and calculating mean

[GIS] how to overlay shapefile and raster

Related Question