[GIS] Windowed writing in rasterio

pythonrasterio

I have two large datasets, RP1 and RP2. I would like to subtract one from the other, however the datasets are extremely large and so exit out with a MemoryError when I try a direct read:

RP1 = rasterio.open(big_file_path)
RP1 = RP1.read(1, masked=True)

RP1 = RP1_dataset.read(1, masked=True)

File "rasterio_io.pyx", line 348, in

rasterio._io.DatasetReaderBase.read MemoryError

Therefore I have tried to resort to windowed reading and writing as explained by the rasterio docs. I wrote the following script:

import numpy
import rasterio


image = numpy.ones((150, 250), dtype=rasterio.ubyte) * 127
RP1_filename = r'RP1.tif'
RP2_filename = r'RP2.tif'

RP2 = rasterio.open(RP2_filename)
profile = RP2.profile
with rasterio.open(RP1_filename) as RP1:
    for block_index, window in RP1.block_windows(1):
        RP1_block = RP1.read(window=window,masked = True)
        RP2_block = RP2.read(window=window, masked = True)
        result_block = RP2_block - RP1_block
        print(result_block.shape)

RP2.close()

Since RP1 and RP2 have the same window schema I took the liberty of indexing into RP2 by the current RP1 block index. This seems to work fine and the result_block variable contains a masked array as expected.

The problem is I don't know how to collect all these arrays properly such that I can write a new output raster with the result values.

Does anyone know how to do that?

Best Answer

You can directly write the result blocks to the big result tiff file via windowed writing.

import rasterio

src_RP1 = rasterio.open(r'RP1_filename.tif')
src_RP2 = rasterio.open(r'RP2_filename.tif')

out_profile = src_RP1.profile.copy()
dst = rasterio.open(r'result.tif', 'w', **out_profile)

for block_index, window in src_RP1.block_windows(1):
    RP1_block = src_RP1.read(window=window, masked=True)
    RP2_block = src_RP2.read(window=window, masked=True)

    result_block = RP2_block - RP1_block
    dst.write(result_block, window=window)

src_RP1.close()
src_RP2.close()
dst.close()

If indeed you want to assemble one big results array in memory you would have to initialize an empty array with the dimensions of RP1 and iteratively assign the block arrays to the respective rows/columns (you can get these from the window).

Related Question