[GIS] Using python GDAL to reproject an HDF

coordinate systemgdalgdalwarpgeotiff-tiffpython

I am working with MODIS data (MOD14) and I only need the fire mask, which is the first dataset.

The problem is that MODIS data in the sinusoidal projection and I need to reproject into something else.

My code reads/write out the original (unprojected) data and now I am trying to write out the projected data but am having difficulties.

I am pretty new to python and GDAL so I am not sure what I've been doing wrong. I got the reprojection parameters from this answer on StackExchange but I am having trouble converting it to the python version.

I am not sure if I am not writing it correctly or perhaps the srcSRS option is incorrect. I got the srcSRS information from SpatialReference.org here SR-ORG:6974

file_name = 'MOD14.A2018304.0525.006.2018304093408.hdf'

hdf_file = gdal.Open(file_name)
#print(hdf_file.GetMetadata()) 
#hdf_file.GetSubDatasets()  #fire mask is the first subdataset


band = gdal.Open(hdf_file.GetSubDatasets()[0][0], gdal.GA_ReadOnly)
arr = band_ds.ReadAsArray()  #now numpy array

driver = gdal.GetDriverByName("GTiff")
rows,cols = band_array.shape
dtype=gdal.GDT_Byte  # fire mask is an 8-bit unsigned integer
outname = 'Mod14_org.tif'

#write out unprojected geotiff
dataset_out = driver.Create(outname, cols, rows, 1, dtype)   # fire mask is an 8-bit unsigned integer
dataset_out.GetRasterBand(1).WriteArray(arr)
dataset_out.FlushCache()
dataset_out = None


#MODIS is uses the Sinusoidal projection, now reproject it

ds = gdal.Warp('MOD14_warp.tif', 'Mod14_org.tif', dstSRS='EPSG:4326',
               srcSRS='+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs ', outputType=gdal.GDT_Byte)
ds=None

Best Answer

The main problem I see is that GDAL does not recognize the CRS/Transform information from the dataset. As such, transforming the dataset is not possible as the original information cannot be detected.

So, you need to construct the CRS/Transform yourself.

Step 0: Get the CRS of the dataset

Based on the link you gave, this should be the CRS of the sinusoidal projection:

# https://spatialreference.org/ref/sr-org/modis-sinusoidal-3/
crs_str = "+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs"

Step 1: Open the dataset to get the dimensions and metadata. I used rioxarray to open it up and select the variable. Note: rioxarray uses GDAL under the hood.

import rioxarray

rds = rioxarray.open_rasterio(
    "MOD14.A2019354.1155.006.2019354142314.hdf",
    variable=["0"],
    parse_coordinates=False,
)

Step 2: Get the transform of the raster

On inspection, I used the lat/lon bounding coordinates to find the top/left coordinate needed to construct the transform:

<xarray.Dataset>
Dimensions:  (band: 1, x: 1354, y: 2030)
Coordinates:
  * band     (band) int64 1
Dimensions without coordinates: x, y
Data variables:
    0        (band, y, x) uint8 ...
Attributes:
    ANCILLARYINPUTPOINTER.1:            MOD03.A2019354.1155.006.2019354183746...
    ...
    NORTHBOUNDINGCOORDINATE:            0.652119162677686
    ...
    WESTBOUNDINGCOORDINATE:             -36.2435873100965

The next step is to transform it from latlon to sinusoidal. I used pyproj to do that:

from pyproj import Transformer

transformer = Transformer.from_crs("EPSG:4326", crs_str, always_xy=True)
west, north = transformer.transform(rds.WESTBOUNDINGCOORDINATE, rds.NORTHBOUNDINGCOORDINATE)

The docs say the resolution is 1km or 1000m, so I used that as the pixel size and constructed the transform:

from affine import Affine

pixel_size = 1000 # 1 km
transform = Affine(pixel_size, 0, west, 0, -pixel_size, north)

Step 3: Generate the coordinates of the raster

Using the transform and dimensions of the raster, I generated the coordinates:

from rioxarray.rioxarray import affine_to_coords

coords = affine_to_coords(transform, rds.rio.width, rds.rio.height)
rds.coords["x"] = coords["x"]
rds.coords["y"] = coords["y"]

Step 4: Write the CRS to the dataset

rds.rio.write_crs(crs_str, inplace=True)

Step 5: Reproject dataset

rds4326 = rds.rio.reproject("EPSG:4326")

Step 5: Write dataset to GeoTiff

rds4326["0"].rio.to_raster("fire_mask.tif")

UPDATE:

For some reason, a regular reproject didn't work. But reprojecting by left half and right half got something output that wasn't nodata:

rdssub = rds.isel(x=slice(rds.rio.width // 2, None))
rds4326 = rdssub.rio.reproject("EPSG:4326")
rds4326["0"].rio.to_raster("right_fire_mask.tif")
rdssub = rds.isel(x=slice(None, rds.rio.width // 2))
rds4326 = rdssub.rio.reproject("EPSG:4326")
rds4326["0"].rio.to_raster("left_fire_mask.tif")

enter image description here

Related Question