Python Rasterio – How to Perform Masking with Rasterio in Python

maskingpythonrasterio

Introduction

I have a shapefile for my area of interest (which in this case is Tehran) and also a TIF file that contains the categorical information(cloud status). In this case, I want to mask the TIF file and clip my area of interest (Tehran) using the shapefile provided.

This TIF file just contains numbers between 0 and 5. Therefor if I run the below code:

with rasterio.open("SVDNB_npp_d20120301.vcld.tif") as src:
  print(src.read(1))

I will have an output like:

[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [4 4 4 ... 4 4 4]
 [4 4 4 ... 4 4 4]
 [0 0 0 ... 0 0 0]]

I am using the below code to mask the TIF file(The few first lines are just to change coordinate references system) using the shapefiles provided:

import fiona
import rasterio
import rasterio.mask
from rasterio.windows import Window
from pyproj import Transformer
import geopandas as gpd

# To Change the CRS
df = gpd.read_file('tehran.shp').to_crs(4326)
df.to_file("tehran.shp")

# To mask the data
with fiona.open("tehran.shp", "r") as shapefile:
  shapes = [feature["geometry"] for feature in shapefile]
with rasterio.open("SVDNB_npp_d20120301.vcld.tif") as src:
  out_image, transformed = rasterio.mask.mask(src, shapes, crop=True, filled=True)
  out_meta = src.meta

Issue

The problem is that the masked variable, which in this case is represented by out_image variable, is supposed to contain just numbers between 0 to 5 but, unfortunately, whenever I call this variable name, it outputs is like:

array([[255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       ...,
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255]], dtype=uint8)

I am confused and not sure why the numbers now are changed to color-bits.

Best Answer

Your code works fine. If you look at SVDNB_npp_d20120301.vcld.tif (src.meta) you'll see nodata = 255 and you specified filled=True which means fill out_image with nodata values. So when you print your array, you are seeing the nodata values around the masked area of actual values.

If you write out the image and load it in a GIS, you'll see:

# To mask the data
with fiona.open("tehran.shp", "r") as shapefile:
    shapes = [feature["geometry"] for feature in shapefile]

with rasterio.open("SVDNB_npp_d20120301.vcld.tif") as src:
    out_image, transformed = rasterio.mask.mask(src, shapes, crop=True, filled=True)
    out_profile = src.profile.copy()

out_profile.update({'width': out_image.shape[2],'height': out_image.shape[1], 'transform': transformed})
with rasterio.open("tehran.tif", 'w', **out_profile) as dst:
    dst.write(out_image)

enter image description here

If you just want a 1D array of 0-5, just filter out the nodata

out_image[out_image<=5]]