[GIS] How to create mask bands in GDAL and Python

gdalmaskingpython

I have small program in Python that uses gdal and numpy to perform a logarithmic scale of some data into the RGB 0-255 range.

Now this input data has a nodata value set to mask the valid values, and I would like to have the output to retain that mask.

Obviously I can't use nodata for my output as I am ussing the full byte range for colour values.

From what I can tell from the GDAL API I need to create a mask band in the output file.

I added the code to do this but suddenly the program started running really slow and the output file size became huge.

What it the best way to do this?

Here is my code:

#!/usr/bin/env python

import gdal
from gdalconst import *
import numpy
import sys

infile = sys.argv[1]
outfile = sys.argv[2]
format = 'GTiff'
type = GDT_Byte

indataset = gdal.Open(infile, GA_ReadOnly)

gdal.SetConfigOption('GDAL_TIFF_INTERNAL_MASK', 'YES')
out_driver = gdal.GetDriverByName(format)
outdataset = out_driver.Create(outfile, indataset.RasterXSize, indataset.RasterYSize, indataset.RasterCount, type)
outdataset.CreateMaskBand(gdal.GMF_PER_DATASET)

gt = indataset.GetGeoTransform()
outdataset.SetGeoTransform(gt)

prj = indataset.GetProjectionRef()
outdataset.SetProjection(prj)

for iBand in range(1, indataset.RasterCount + 1):
    inband = indataset.GetRasterBand(iBand)
    inmaskband = inband.GetMaskBand()

    outband = outdataset.GetRasterBand(iBand)
    outmaskband = outband.GetMaskBand()

    for i in range(inband.YSize - 1, -1, -1):
        scanline = inband.ReadAsArray(0, i, inband.XSize, 1, inband.XSize, 1)
        scanline = numpy.multiply(numpy.divide(numpy.log1p(numpy.minimum(10000, numpy.maximum(0, scanline))), numpy.log1p(10000)), 255)
        outband.WriteArray(scanline, 0, i)

        scanline = inmaskband.ReadAsArray(0, i, inband.XSize, 1, inband.XSize, 1)
        outmaskband.WriteArray(scanline, 0, i)

Best Answer

for one particular band, here's how I create a nodata mask.

import gdal
import numpy as np

nodata = -9999#or set it based on the raster nodata value.
r = gdal.Open(rastername)
raster_arr = np.array(r.GetRasterBand(band).ReadAsArray())

nodatamask = raster_arr == nodata

#do your thing and end up with 
#a result_raster that you'd like to save

result_raster[nodatamask] = nodata
#then save.

Of course, if you're doing logarithms and you have negative values, you'd need to make a temporary nodata value.

nodatamask = raster_arr == nodata

raster_arr[nodatamask] = 100000000

#do your thing

result_raster[nodatamask] = nodata
#then save.