I have a raster that I am trying to convert to NetCDF format with compression level 9 using the xarray package. I assume that compression is configured by passing a dict to the encoding parameter, but I am not sure exactly what to put in it:
f = directory + "/D_Passaic_F02_NBR_E0001_WGS84_comp"
t = xarray.open_rasterio(f)
encode = {'zlib': True, 'complevel': 9}
t.to_netcdf(output_dir+"/Test2.nc", encoding=encode)
This raises KeyError: 'zlib', and I am not sure what I am supposed to use here. Suggestions?
The data array has a single band and x and y coordinates, like this:
<xarray.DataArray (band: 1, y: 9635, x: 14564)>
or, in more detail:
<bound method ImplementsArrayReduce._reduce_method.<locals>.wrapped_func of
<xarray.DataArray (band: 1, y: 9635, x: 14564)>
[140324140 values with dtype=float32]
Coordinates:
* band (band) int32 1
* y (y) float64 41.06 41.06 41.06 41.06 ... 40.74 40.74 40.74 40.74
* x (x) float64 -74.45 -74.45 -74.45 -74.45 ... -73.97 -73.97 -73.97
Attributes:
transform: (3.2670488250568696e-05, 0.0, -74.447024371179...
crs: +init=epsg:4326
res: (3.2670488250568696e-05, 3.2670488250568696e-05)
is_tiled: 1
nodatavals: (-9999.0,)
scales: (1.0,)
offsets: (0.0,)
AREA_OR_POINT: Area
HISTOGRAM: 9090|9307|9097|9209|8729|8864|8744|8864|9181|9...
TIFFTAG_ARTIST: HEC-RAS
TIFFTAG_IMAGEDESCRIPTION: Depth (Max)>
Best Answer
The error hints at xarray trying to find a variable called "zlib" in your data. The encoding dict has to be keyed by variable name, with the compression settings nested under that name.
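A minimal sketch of that nested structure, using plain dicts only. The variable name "dem" and the helper build_encoding are hypothetical and chosen for illustration; the compression keys are the ones from the question.

```python
# Compression settings for a single variable (keys taken from the question).
compression = {"zlib": True, "complevel": 9}


def build_encoding(variable_names, settings):
    """Hypothetical helper: give each variable name its own copy of the settings."""
    return {name: dict(settings) for name in variable_names}


# "dem" is a made-up variable name, not something read from the raster.
encode = build_encoding(["dem"], compression)

# The top-level keys are variable names, so "zlib" only appears one level down.
print(encode)
```

Passing the flat dict from the question makes xarray treat "zlib" and "complevel" as variable names, which matches the KeyError shown there.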
But due to the way the data was loaded this is tricky.
What you have after loading a file using open_rasterio is a DataArray. A DataArray does not have a structure with variables. When writing to NetCDF, your data needs to be a Dataset. If you call to_netcdf on a DataArray, it will automatically generate a Dataset with a variable for the data called "__xarray_dataarray_variable__". So keying the encoding dict on that name will work, but the resulting file will be ugly with a name like that.
When you build a Dataset manually, you can specify the variable name. For example, if your band contains DEM data, a name that says so might make sense.
Afterwards you can key the encoding dict on that variable name and get a reasonable name in the result.
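The two steps, naming the variable and then encoding it, can be sketched as one small function. This is an illustration under assumptions, not the answer's original code: save_compressed is a hypothetical helper, "dem" is a made-up name, and the function relies only on the DataArray's to_dataset method and the Dataset's to_netcdf method.

```python
def save_compressed(data_array, path, name, complevel=9):
    """Hypothetical helper: write a DataArray to NetCDF with zlib compression.

    data_array is expected to be an xarray.DataArray, such as the object
    returned by open_rasterio in the question.
    """
    # Naming the variable explicitly avoids the auto-generated default name.
    ds = data_array.to_dataset(name=name)

    # The encoding is keyed by variable name; the settings sit one level below.
    encoding = {name: {"zlib": True, "complevel": complevel}}

    ds.to_netcdf(path, encoding=encoding)
    return encoding


# Example call with the question's objects (t and output_dir), left as a comment
# because it needs the original raster:
# save_compressed(t, output_dir + "/Test2.nc", name="dem")
```

Returning the encoding dict is only a convenience for checking what was passed to to_netcdf. Writing zlib-compressed output also assumes a NetCDF4-capable backend is installed.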
Sadly, compression with xarray is very RAM hungry. Save any unsaved work before you run it, in case your system runs out of memory.