[GIS] GDAL (Python) Aggregate Raster Into Lower Resolution

gdal python raster

I have a (continent-scale) raster with 30m resolution in an Albers equal area projection. Each 30m cell has a value 1-34.

I would like to aggregate information from this raster into a 1km grid such that each 1km grid cell contains a floating-point value which is the number of 30m cells with a value of 2 divided by the total number of 30m cells which fit into the 1km cell.

Is there a way of doing this with GDAL? Or an efficient way to hook GDAL and Python together to perform the calculation (given the large size of the raster, not all of it fits into RAM at once)?

Best Answer

Command-line

First, create a second file with gdal_calc.py in which cells equal to 2 become 1 and all other cells become 0:

$ gdal_calc.py -A input.tif --calc="A == 2" --outfile equals2.tif

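Then aggregate this mask to the 1 km grid using gdalwarp with -r average. Because the mask holds only 0 and 1, the average in each output cell is the fraction of contributing input cells equal to 2. The GDAL documentation describes this resampling method as:

average resampling, computes the average of all non-NODATA contributing pixels. (GDAL >= 1.10.0)

$ gdalwarp -tr 1000 1000 -r average -ot Float32 equals2.tif equals2-averaged_1km.tif

The -ot Float32 option stores the result as a floating-point fraction; without it, the output inherits the mask's data type.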
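To see why averaging a 0/1 mask gives the requested fraction, here is a small NumPy sketch. It assumes a toy array whose sides are exact multiples of the aggregation factor (unlike 30 m into 1 km, where the cells do not nest evenly), and the function name fraction_equal is purely illustrative.

```python
import numpy as np


def fraction_equal(cells, value, factor):
    """Fraction of `cells` equal to `value` in each factor x factor block.

    Assumes both array dimensions are exact multiples of `factor`.
    """
    mask = (cells == value).astype(np.float32)  # 1.0 where equal, else 0.0
    rows, cols = mask.shape
    # Split each axis into (block index, position within block), then
    # average over the two within-block axes.
    blocks = mask.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


cells = np.array([[2, 2, 1, 5],
                  [2, 7, 3, 4],
                  [9, 9, 2, 2],
                  [9, 9, 2, 2]])
fractions = fraction_equal(cells, value=2, factor=2)
print(fractions)
```

The top-left 2x2 block contains three cells equal to 2, so its fraction is 0.75; the bottom-right block is all 2s, so its fraction is 1.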

Python/GDAL

Within the Python/GDAL API, similar work can be done. The Numpy processing will be similar, except that you might need to process chunks of the input raster if it is larger than your RAM. Look at the source code for gdal_calc.py to get an idea how this can be done, if necessary.
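A minimal sketch of that chunked approach might look like the following. It is not taken from gdal_calc.py; the function names, the 4096-pixel tile size and the GeoTIFF creation options are assumptions for illustration, it reads only band 1, and it does not treat NoData specially.

```python
def iter_windows(xsize, ysize, block=4096):
    """Yield (xoff, yoff, xcount, ycount) tiles that cover the raster."""
    for yoff in range(0, ysize, block):
        ycount = min(block, ysize - yoff)
        for xoff in range(0, xsize, block):
            xcount = min(block, xsize - xoff)
            yield xoff, yoff, xcount, ycount


def write_mask(src_path, dst_path, value=2, block=4096):
    """Write a Byte raster that is 1 where the source equals `value`."""
    # Imported here so iter_windows stays usable without GDAL installed.
    from osgeo import gdal

    src = gdal.Open(src_path)
    src_band = src.GetRasterBand(1)
    driver = gdal.GetDriverByName("GTiff")
    dst = driver.Create(dst_path, src.RasterXSize, src.RasterYSize, 1,
                        gdal.GDT_Byte,
                        options=["COMPRESS=LZW", "TILED=YES",
                                 "BIGTIFF=IF_SAFER"])
    dst.SetGeoTransform(src.GetGeoTransform())
    dst.SetProjection(src.GetProjection())
    dst_band = dst.GetRasterBand(1)

    # Only one tile is held in memory at a time.
    for xoff, yoff, xcount, ycount in iter_windows(
            src.RasterXSize, src.RasterYSize, block):
        data = src_band.ReadAsArray(xoff, yoff, xcount, ycount)
        dst_band.WriteArray((data == value).astype("uint8"), xoff, yoff)

    dst_band.FlushCache()
    dst = None  # closing the dataset flushes it to disk


# Tiles for a tiny 10 x 5 raster with a block size of 4.
windows = list(iter_windows(10, 5, block=4))
print(windows)
```

Calling write_mask("input.tif", "equals2.tif") would then play the same role as the gdal_calc.py step above.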

The second step of aggregating is accomplished with GDALReprojectImage, and might look something like:

gdal.ReprojectImage(src_ds, dst_ds, None, None, gdal.GRA_Average)
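The call above needs a destination dataset that already has the coarse grid defined. One way to set that up, sketched under the assumptions that the source is north-up, that the 1 km grid should share the source's top-left corner, and that a GeoTIFF output is acceptable (target_grid and aggregate_mask are hypothetical helper names):

```python
import math


def target_grid(geotransform, xsize, ysize, res=1000.0):
    """Return (geotransform, xsize, ysize) for a coarser north-up grid.

    Assumes a north-up source (no rotation terms, negative pixel height)
    and keeps the source's top-left corner as the origin.
    """
    x_origin, pixel_w, _, y_origin, _, pixel_h = geotransform
    new_xsize = int(math.ceil(xsize * pixel_w / res))
    new_ysize = int(math.ceil(ysize * abs(pixel_h) / res))
    return (x_origin, res, 0.0, y_origin, 0.0, -res), new_xsize, new_ysize


def aggregate_mask(mask_path, dst_path, res=1000.0):
    """Average a 0/1 mask onto a `res`-sized Float32 grid."""
    # Imported here so target_grid stays usable without GDAL installed.
    from osgeo import gdal

    src_ds = gdal.Open(mask_path)
    new_gt, new_xsize, new_ysize = target_grid(
        src_ds.GetGeoTransform(), src_ds.RasterXSize, src_ds.RasterYSize, res)

    driver = gdal.GetDriverByName("GTiff")
    # Float32 so that fractions are not rounded to whole numbers.
    dst_ds = driver.Create(dst_path, new_xsize, new_ysize, 1,
                           gdal.GDT_Float32)
    dst_ds.SetGeoTransform(new_gt)
    dst_ds.SetProjection(src_ds.GetProjection())

    # Source and destination share a projection, so both WKT arguments
    # are left as None.
    gdal.ReprojectImage(src_ds, dst_ds, None, None, gdal.GRA_Average)
    dst_ds = None  # closing the dataset flushes it to disk


# Grid for a 100 x 100 raster of 30 m cells with its top-left at (0, 3000).
grid = target_grid((0.0, 30.0, 0.0, 3000.0, 0.0, -30.0), 100, 100)
print(grid)
```

The output size is rounded up so that the coarse grid covers the whole source extent; the last row and column of 1 km cells may therefore extend past the source data.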

Related Solutions

[GIS] (How) can I write string data to a raster with Python/GDAL

A Raster Attribute Table (RAT) is the best way to associate strings with pixel values. The formats with the best RAT support that I know of are ERDAS Imagine (HFA) and KEA (KEA; http://kealib.org/).

The steps I would use are:

1. Write a raster with integer values (you can use GDAL to write a NumPy array).
2. Add a RAT containing each pixel value (you can use the Raster GIS module in RSGISLib to do this; http://rsgislib.org/rsgislib_rastergis.html).
3. Add a new column with the string associated with each pixel value (you can use RIOS to do this; https://bitbucket.org/chchrsc/rios/).

I've written a script to do steps 2 and 3. It's a bit long to post here, so I've uploaded it to https://bitbucket.org/snippets/danclewley/geaX/add-class-name-to-rat
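As a rough alternative to the linked script, steps 2 and 3 can also be sketched with GDAL's own RasterAttributeTable class instead of RSGISLib and RIOS. This is a swapped-in technique, not the author's method: the column names, the helper names and the example class names below are all hypothetical, and whether the table is actually persisted depends on the raster format's driver.

```python
def rat_rows(class_names):
    """Return (pixel value, class name) pairs sorted by pixel value."""
    return sorted(class_names.items())


def write_class_names(raster_path, class_names):
    """Attach a two-column RAT (value, class name) to band 1."""
    # Imported here so rat_rows stays usable without GDAL installed.
    from osgeo import gdal

    ds = gdal.Open(raster_path, gdal.GA_Update)
    band = ds.GetRasterBand(1)

    rat = gdal.RasterAttributeTable()
    rat.CreateColumn("Value", gdal.GFT_Integer, gdal.GFU_MinMax)
    rat.CreateColumn("ClassName", gdal.GFT_String, gdal.GFU_Name)

    rows = rat_rows(class_names)
    rat.SetRowCount(len(rows))
    for row, (value, name) in enumerate(rows):
        rat.SetValueAsInt(row, 0, value)      # column 0: pixel value
        rat.SetValueAsString(row, 1, name)    # column 1: class name

    band.SetDefaultRAT(rat)
    ds = None  # closing the dataset writes the table


# Hypothetical class names, for illustration only.
example = {2: "Forest", 1: "Water", 3: "Urban"}
rows = rat_rows(example)
print(rows)
```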

[GIS] R: aggregate raster with ‘mode’ function – how does it work

Ideally, when reducing by a factor, if there is a multimodal result I'd like aggregate to randomly assign the new cell one of the modal values and not always choose the same (if that's indeed what it does).

That is not what it does. See ?modal and the ties argument.

Your question is really about the modal function which you pass on to aggregate (both in package raster). So read the help file of modal and pick the arguments you like to make it behave how you want it to. If you cannot do that, find a better one elsewhere, or write your own.

The default behavior of modal is to break ties randomly, as illustrated here:

set.seed(9)
table(sapply(1:1000, function(i) modal(c(1,1,2,2))))

#  1   2 
#507 493

You have a further question about na.rm stating that

in the top right of the data, that new cell should really be NA as it has 3 smaller NA values.

I suppose you mean that there are three NAs and one 1, such that NA should be the mode. Perhaps that should be allowed as an option, but currently NA itself cannot be the modal value. The workaround you propose should be OK.