Geodataframe Rasterio – Adding Raster Pixel Information to Existing Geodataframe with One Column per Band

geodataframe raster rasterio

I have a land cover dataset (a geopandas.GeoDataFrame) with a series of "buffers" over a territory.
The dataset has roughly this format:

       unit_id ,    ...,     geometry,
30     58001214,    ...,     POLYGON ((-71.91521 -40.73168, -71.91521 -40.7...,
56     26002238,    ...,     POLYGON ((-72.09765 -42.87532, -72.09765 -42.8...,
63     26002231,    ...,     POLYGON ((-72.00695 -42.25299, -72.00696 -42.2...,

Each polygon is a "buffer": a small circle roughly 30 m across, which is about 3×3 pixels at my raster's resolution.

I'd like to add columns with the pixel values from a raster image, each column being a "band" in the raster.

       unit_id ,    b1,  b2,  b3,     geometry,
30     58001214,    255, 255, 255,    POLYGON ((-71.91521 -40.73168, -71.91521 -40.7...,
56     26002238,    255, 255, 255,    POLYGON ((-72.09765 -42.87532, -72.09765 -42.8...,
63     26002231,    255, 255, 255,    POLYGON ((-72.00695 -42.25299, -72.00696 -42.2...,

Now, I've already made the mask using rasterio with the dataset geometry over the raster:

out_image, out_transform = rasterio.mask.mask(dataset=src, shapes=grid_samples.geometry)

This returns the information I want, and has a matrix per band:

(
    array(
        [
            [[0, 0, 0, ..., 0, 0, 0],
             [0, 0, 0, ..., 0, 0, 0],
             [0, 0, 0, ..., 0, 0, 0],
             ...,
             [0, 0, 0, ..., 0, 0, 0],
             [0, 0, 0, ..., 0, 0, 0],
             [0, 0, 0, ..., 0, 0, 0]],
        ...,
            [[0, 0, 0, ..., 0, 0, 0],
             [0, 0, 0, ..., 0, 0, 0],
             [0, 0, 0, ..., 0, 0, 0],
             ...,
             [0, 0, 0, ..., 0, 0, 0],
             [0, 0, 0, ..., 0, 0, 0],
             [0, 0, 0, ..., 0, 0, 0]],
        ], dtype=uint16),
    Affine(
        0.00026949458523585647, 0.0, -57.0323305897086,
        0.0, -0.00026949458523585647, -27.977849860845676
    )
)
len(out_image)
>>> 26   # the right number of bands on my raster.

But it looks like the matrix index doesn't match the dataframe order. How can I "merge" the data back into the original dataframe?
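A note on the mismatch: the first axis of the array returned by rasterio.mask.mask is the band axis, so len(out_image) counts bands and not polygons. The sketch below illustrates this with a stand-in NumPy array instead of a real raster; the 4 x 5 pixel size and the pixel position are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for out_image: 26 bands over a 4 x 5 pixel raster.
n_bands, height, width = 26, 4, 5
out_image = np.zeros((n_bands, height, width), dtype="uint16")

# Pretend one small buffer covers the pixel at row 2, column 3.
out_image[:, 2, 3] = 255

print(len(out_image))             # length of the first axis, i.e. the band count
print(out_image.shape)            # (bands, rows, cols)
print(out_image[:, 2, 3].shape)   # one value per band for that pixel
```

Because all polygons are burned into the same (bands, rows, cols) array, nothing in it records which pixels came from which row of the dataframe.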

Best Answer

I more or less got what I wanted by applying a function that takes the mean of each band matrix, one row at a time.

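import geopandas as gpd
import rasterio
import rasterio.mask

raster = rasterio.open('myplace.tiff')
samples = gpd.read_file("samples.gpkg")

def _do_extract(row, raster):
    # Clip the raster to this row's polygon; out_image has shape (bands, rows, cols).
    out_image, out_transform = rasterio.mask.mask(raster, [row.geometry], crop=True)
    # Band descriptions become the column names (entries are None for bands without one).
    bands = raster.descriptions

    # One new column per band, holding the mean of the clipped pixels.
    for index, item in enumerate(bands):
        value = out_image[index]
        row[item] = value.mean()
    return row


# Apply row by row so each result stays aligned with its source row.
result = samples.apply(lambda x: _do_extract(x, raster), axis=1)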

I guess it's not optimal. Suggestions on improvement are welcome.
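One possible refinement, sketched under assumptions and not a tested solution: with crop=True, pixels inside the clip window but outside the polygon are filled with the nodata value (0 when the raster defines none), so a plain mean is pulled toward that fill value. The sketch below ignores fill pixels when averaging, then joins the per-band means back on the dataframe index. The helper band_means, the stand-in arrays in clips, and the band names are hypothetical; in practice each array would come from one rasterio.mask.mask call per row.

```python
import numpy as np
import pandas as pd


def band_means(out_image, nodata=0):
    """Mean of each band, ignoring pixels equal to the fill value (assumed 0)."""
    masked = np.ma.masked_equal(out_image, nodata)
    # Flatten each band to one row, then average only the unmasked pixels.
    means = masked.reshape(masked.shape[0], -1).mean(axis=1)
    # Bands with no valid pixel at all become NaN.
    return np.ma.filled(means, np.nan)


# Stand-in clipped arrays keyed by dataframe index label, shape (bands, rows, cols).
clips = {
    30: np.array([[[0, 10], [20, 30]], [[0, 2], [4, 6]]], dtype="uint16"),
    56: np.array([[[5, 5], [5, 0]], [[1, 1], [1, 0]]], dtype="uint16"),
}
band_names = ["b1", "b2"]  # hypothetical; could come from raster.descriptions

# Stand-in for the GeoDataFrame (geometry column omitted here).
samples = pd.DataFrame({"unit_id": [58001214, 26002238]}, index=[30, 56])

# One row per polygon, one column per band, aligned on the original index.
means = pd.DataFrame.from_dict(
    {idx: band_means(arr) for idx, arr in clips.items()},
    orient="index",
    columns=band_names,
)
result = samples.join(means)
print(result)
```

Treating 0 as fill also discards genuine zero-valued pixels. Passing filled=False to rasterio.mask.mask returns a masked array directly, which avoids that guess.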
