Numpy Rasterio – Creating an In-Memory Dataset from Numpy Array

numpyrasterioresampling

I am reading in a raster using rasterio, and then upsampling the raster as per the example in the documentation:

def upsample_raster(raster):
    return raster.read(
        out_shape=(raster.height * 2, raster.width * 2, raster.count),
        resampling=resampling.bilinear,
    )

This seems to work fine, except that this method returns the data in a numpy array.

My current application workflow includes operations such as masking which takes as input rasterio's DatasetReader class.

Thus, I am searching for a way to resample a raster and get the result as a DatasetReader or, without dumping the data to disk and re-opening the file, convert a numpy array into a valid DatasetReader.

Best Answer

You need to recreate a DatasetReader manually and you can use a MemoryFile to avoid writing to disk.

You can re-use the metadata from the input raster in the DatasetReader, but you'll need to modify the height and width properties and the transform. From documentation:

After these resolution changing operations, the dataset’s resolution and the resolution components of its affine transform property no longer apply to the new arrays.

In the example below note that:

  1. I use a contextmanager so the DatasetReader and MemoryFile objects get cleaned up automatically. This is why I use yield not return in the function
  2. I had to change the order of indexes in raster.read as arrays are (band, row, col) order not (row, col, band) like you used in your snippet.
from contextlib import contextmanager  

import rasterio
from rasterio import Affine, MemoryFile
from rasterio.enums import Resampling

# use context manager so DatasetReader and MemoryFile get cleaned up automatically
@contextmanager
def resample_raster(raster, scale=2):
    t = raster.transform

    # rescale the metadata
    transform = Affine(t.a / scale, t.b, t.c, t.d, t.e / scale, t.f)
    height = raster.height * scale
    width = raster.width * scale

    profile = raster.profile
    profile.update(transform=transform, driver='GTiff', height=height, width=width)

    data = raster.read( # Note changed order of indexes, arrays are band, row, col order not row, col, band
            out_shape=(raster.count, height, width),
            resampling=Resampling.bilinear,
        )

    with MemoryFile() as memfile:
        with memfile.open(**profile) as dataset: # Open as DatasetWriter
            dataset.write(data)
            del data

        with memfile.open() as dataset:  # Reopen as DatasetReader
            yield dataset  # Note yield not return     


with rasterio.open('path/to/raster') as src:
    with resample_raster(src) as resampled:
        print('Orig dims: {}, New dims: {}'.format(src.shape, resampled.shape))
        print(repr(resampled))

Output

Orig dims: (4103, 4682), New dims: (8206, 9364)
<open DatasetReader name='/vsimem/95befda0-2061-4294-982b-20e46f127066.' mode='r'>
Related Question