Python – Managing Memory with Xarray Concat Function in rioxarray

python, rioxarray, xarray

I'm running into the limit of my RAM. My PC has 32 GB of RAM, which is completely filled when I concatenate several DataArrays.

I've written the simple loop below, which reads each band of a Sentinel-2 scene and appends it to a list:

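import rioxarray
import xarray
from tqdm import tqdm

dataarray_list = []
for band_code, band_data in tqdm(scene_bands.items()):

    # Open the band as a Dask-backed DataArray split into 512x512 chunks.
    band = rioxarray.open_rasterio(
        filename=band_data["path"],
        chunks={'x': 512, 'y': 512}
    )

    dataarray_list.append(band)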

scene_bands is a dictionary that maps each band code to its path and description.

Now I'm trying to combine the bands into a single dataset:

dataset = xarray.concat(dataarray_list, dim='band_code')

Here is the error I get once the memory is full:

Process finished with exit code 137 (interrupted by signal 9: SIGKILL)

Is there a way to run the concat from disk instead of in memory?
If you have another approach, please share it.

Best Answer

I've solved the problem using xarray.open_mfdataset instead of rioxarray.open_rasterio.

    band = xarray.open_mfdataset(
        paths=file_path,
        chunks={'x': 512, 'y': 512},
        parallel=True,
    )

With that function I'm able to create the final dataset without problems:

dataset = xarray.concat(dataarray_list, dim='band_code')
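
If you want to check that a concat like this stays lazy before running it on a full scene, here is a minimal sketch. It assumes Dask is installed and uses small synthetic arrays as hypothetical stand-ins for the real bands (the band codes, the 1024x1024 shape and the name "reflectance" are made up for illustration, and all bands are assumed to share one grid):

```python
import dask.array as da
import xarray as xr

# Hypothetical stand-ins for three bands on the same grid; in the real
# workflow these would be the arrays collected in dataarray_list.
band_codes = ["B02", "B03", "B04"]
dataarray_list = [
    xr.DataArray(
        da.zeros((1024, 1024), chunks=(512, 512), dtype="uint16"),
        dims=("y", "x"),
        name="reflectance",
    )
    for _ in band_codes
]

# Concatenate along a new dimension. With Dask-backed inputs this only
# builds a task graph; no pixel values are computed here.
stacked = xr.concat(dataarray_list, dim="band_code")
stacked = stacked.assign_coords(band_code=band_codes)

# True when the result is still backed by a Dask array (not loaded).
is_lazy = isinstance(stacked.data, da.Array)

# Size the array would take if it were fully loaded into memory.
size_mib = stacked.nbytes / 2**20

print(is_lazy, stacked.shape, f"{size_mib:.1f} MiB if fully loaded")
```

If is_lazy comes back False, or the reported size is far larger than the sum of the inputs, that is a hint that something is being loaded or expanded during the concat rather than kept on disk.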