Rasterio – Fixing Flipped/Inverted Y Axis in Xarray Using Rasterio

rasterioxarray

The data I read with xarray with rasterio engine is inverted along Y axis (This is SMAPL4 data). The correct dimension values are y=ds.y*(-1).

ds = xr.open_mfdataset(file_paths, engine="rasterio", chunks=chunks, combine='nested', concat_dim='time', parallel=True)

enter image description here

How can I fix this, without messing up the chunks?

Debug attempt 1

ds = ds.reindex(y=ds.y*(-1))

Result: The chunking over the y-axis disappears. Plus, numeric values in the variable data field somehow disappears.

Debug attempt 2:

ds= ds.isel(y=slice(None, None, -1))

Result: This code just re-selects the grids and does not change the dimension values themselves

Debug attempt 3:

ds= ds.reindex_like(ds_template)

Result: Again, numeric values within precipitation data disappear

Debug attempt 4:
Change engine from rasterio to default (auto-detect).

ds = xr.open_mfdataset(file_paths, chunks=chunks, combine='nested', concat_dim='time', parallel=True)

Result: My file has some issues with the default engine; the data is read as "attributes."

Best Answer

There is a better way to set the coordinates. For the SMAP dataset, you have to assign "cell_lat" and "cell_lon" as the coordinates of the dataset, and use them to plot. I am posting on behalf of kmuehlbauer based on our discussion in the Github issue. Thank you kmuehlbauer!

Import xarray as xr

Use xarray with netcdf engine

# load root group with coordinates
ds_NSIDC_root = xr.open_dataset(os.path.join(input_path, fn), group="/", engine='netcdf4')

# load data from Geophysical_Data group
ds_NSIDC_precip = xr.open_dataset(os.path.join(input_path, fn), group="Geophysical_Data", engine='netcdf4')

# merge groups
ds_NSIDC = xr.merge([ds_NSIDC_root, ds_NSIDC_precip])

# plot
ds_NSIDC.precipitation_total_surface_flux.plot()
# the above is essentially the same as 
# ds_NSIDC.precipitation_total_surface_flux.plot(x="x", y="y")

enter image description here

Use xarray with rasterio engine

ds_NASA_download_rasterio = xr.open_dataset(os.path.join(input_path, fn), engine='rasterio')
ds_NASA_download_rasterio = ds_NASA_download_rasterio.set_coords(["cell_lat", "cell_lon"])
ds_NASA_download_rasterio.Geophysical_Data_precipitation_total_surface_flux[0].plot(y="cell_lat", x="cell_lon")

enter image description here

Note that the rasterio method only works for files directly downloaded from NASA Earth Data, and not preprocessed data on the NSIDC server.