[GIS] Reading raster files by block with rasterio

gdalpythonrasterrasterio

I find rasterio much easier to use that plain ogr/gdal, so I'd prefer using it. However, I need to process large rasters, so reading rasters by block as explained in the Python GDAL course is important for me.

Does rasterio support this or should I handle the large, tilable rasters with GDAL instead?

Best Answer

Reading Rasters by block can be done in rasterio and I'd argue it's easier than in GDAL. There is even a tutorial on windowed read/write over at GitHub.

Let's take a look at the read functions arguments, which allows you to set a window to read data from:

def read(self, indexes=None, out=None, window=None, masked=False,
        boundless=False):
"""Read raster bands as a multidimensional array
    Parameters
    ----------
    window : a pair (tuple) of pairs of ints, optional
        The optional `window` argument is a 2 item tuple. The first
        item is a tuple containing the indexes of the rows at which
        the window starts and stops and the second is a tuple
        containing the indexes of the columns at which the window
        starts and stops. For example, ((0, 2), (0, 2)) defines
        a 2x2 window at the upper left of the raster dataset.
"""

So all you need now is a way to create the window tuples. That's what rasterios block_windows function is for:

def block_windows(self, bidx=0):
    """Returns an iterator over a band's block windows and their
    indexes.
    [....]
    The primary use of this function is to obtain windows to pass to
    read_band() for highly efficient access to raster block data.
    """

So in your case I guess the code would look something like this:

with rasterio.open("your/data/geo.tif") as src:
     for block_index, window in src.block_windows(1):
         block_array = src.read(window=window)
         result_block = some_calculation(block_array)