I am trying to access this file from S3 using Python's rasterio:
"s3://sentinel-cogs/sentinel-s2-l2a-cogs/36/N/YF/2023/6/S2B_36NYF_20230605_0_L2A/"
I will be using these two bands : B08.tif(NIR band) and B04.tif(Red Band).
This is the Python code I have tried so far:
import rasterio
import geopandas as gpd
from shapely.geometry import mapping
from rasterio.mask import mask
from rasterio.transform import from_origin
# Load Sentinel-2 bands
nir_path = 's3://sentinel-cogs/sentinel-s2-l2a-cogs/36/N/YF/2023/6/S2B_36NYF_20230605_0_L2A/B08.tif'
red_path = 's3://sentinel-cogs/sentinel-s2-l2a-cogs/36/N/YF/2023/6/S2B_36NYF_20230605_0_L2A/B04.tif'
with rasterio.open(nir_path) as nir_src, rasterio.open(red_path) as red_src:
nir = nir_src.read(1).astype('float32')
red = red_src.read(1).astype('float32')
# Calculate NDVI
ndvi = (nir - red) / (nir + red)
# Load and transform the polygon
polygon_path = 'path_to_sample_polygon.geojson'
polygon_gdf = gpd.read_file(polygon_path)
polygon_gdf = polygon_gdf.to_crs("EPSG:32636")
# Convert polygon to a GeoJSON-like format
polygon_geom = polygon_gdf.geometry.iloc[0]
geometry = [mapping(polygon_geom)]
# Mask NDVI with the polygon
masked_ndvi, out_transform = mask(dataset=ndvi, shapes=geometry, crop=True, transform=nir_src.transform)
# Saving the masked NDVI
masked_ndvi_path = 'masked_ndvi.tif'
meta = nir_src.meta
meta.update({"driver": "GTiff", "height": masked_ndvi.shape[1], "width": masked_ndvi.shape[2], "transform": out_transform})
with rasterio.open(masked_ndvi_path, 'w', **meta) as dst:
dst.write(masked_ndvi)
print("NDVI calculation and masking complete.")
Its taking lot of time to read the files. What can I do access a file which is on AWS server so as to do my task. Also the mean-nvdi value is coming above 2500+ which is too much
My task is to do the following:
- Open Sentinel 2 imagery from AWS servers using python rasterio
module. - Read Polygon in python using geopandas/shapely module
whichever you find suitable. (I have the polygon.geoson file already with me.) - Read only subset of the imagery array
which aligns to the polygon region. - Remember Projection system of
both polygon and imagery are different. To perform any action
between the 2 objects(polygon and imagery) they need to be in the
same projection system. - After subset of the Imagery has been
extracted perform the NDVI calculation.
Best Answer
You are reading and thus downloading the entire image for both of those bands (B08.tif and B04.tif).
It is much quicker to only read the subset of those images that overlap with the polygon instead of clipping it out later on after downloading the lot. This is one of the primary use cases for the Cloud-Optimized Geotiff (COG).
The below example uses rasterio windowed reading and took 6.5 seconds total for all 3 polygons in my test dataset.