[GIS] Turning shapefile into mask and calculating mean

gdalmaskingpython

I have been using Python to process huge arrays stored as NetCDF files. I would like to calculate the mean of an area defined by a shapefile.
I have just installed GDAL but if there are other tools I should use please let me know.

So my main question is how can I turn my shapefile into a mask?

Then use my mask to calculate the NumPy mean of my array within the area of the mask?

shp="country.shp"
array=mynumpyarray
mask=useGDALtochangeshapefiletomask
meanofarea=N.mean(array,limits=mask)

I have been able to convert my shapefile to a raster using gdal_RasterizeLayer link supplied and then a masked array using

maskarray=mask_ds.GetRasterBand(1).ReadAsArray()

Now it is just a question of matching my NetCDF extent with my raster/maskarray. Here are my extents as requested.

my shapefile/raster extent: x_min, x_max, y_min, y_max 140.962408758 149.974994992 -39.1366533667 -33.9813898583

my netcdf file extent: min longitude: 139.8 max longitude: 150.0 min latitude -39.2 max latitude: -33.6 LAT size 106 LON size 193

Best Answer

You are looking for the gdal.RasterizeLayer function.

You could then use ReadAsArray to turn the rasterized polygon into a numpy array.

Based on your NetCDF file extent and rows/columns, the following code should generate you a numpy 0-1 mask that matches the NetCDF exactly.

shapefile=r'whatever your shapefile path is'
xmin,ymin,xmax,ymax=[139.8,-39.2,150.0,-33.6] #Your extents as given above
ncols,nrows=[193,106] #Your rows/cols as given above
maskvalue = 1

xres=(xmax-xmin)/float(ncols)
yres=(ymax-ymin)/float(nrows)
geotransform=(xmin,xres,0,ymax,0, -yres)

src_ds = ogr.Open(shapefile)
src_lyr=src_ds.GetLayer()

dst_ds = gdal.GetDriverByName('MEM').Create('', ncols, nrows, 1 ,gdal.GDT_Byte)
dst_rb = dst_ds.GetRasterBand(1)
dst_rb.Fill(0) #initialise raster with zeros
dst_rb.SetNoDataValue(0)
dst_ds.SetGeoTransform(geotransform)

err = gdal.RasterizeLayer(dst_ds, [1], src_lyr, burn_values=[maskvalue])

dst_ds.FlushCache()

mask_arr=dst_ds.GetRasterBand(1).ReadAsArray()
Related Question