Indexing Copernicus Global Land Service Data into ODC – How to Use NetCDF in Open Data Cube

Tags: copernicus, indexing, netcdf, open-data-cube

I would like to index Copernicus Global Land Service data (available on this site: https://land.copernicus.vgt.vito.be/manifest/) into Open Data Cube.

How can I do this without downloading the data locally? I have already managed it by downloading the data first, but I need a workflow that avoids the local copy.

Best Answer

You may use the eodatasets3 library to index data hosted remotely. I've done it by indexing the official Copernicus Global Land Cover data you can find in this S3 bucket (data access is documented here).

Using the DatasetPrepare object with a remote dataset location, you can call the note_measurement method on a remote file (e.g. a GeoTIFF tile). The library takes care of downloading the file, reading its metadata, and generating an EO3 document you can index into ODC, without the need to keep the file locally afterwards.

from eodatasets3 import DatasetPrepare, serialise

DATASET_LOCATION = 'https://s3-eu-west-1.amazonaws.com/vito.landcover.global/v3.0.1/2015/E080N60/'
TILE_URI = 'E080N60_PROBAV_LC100_global_v3.0.1_2015-base_Discrete-Classification-proba_EPSG-4326.tif'

p = DatasetPrepare(
    dataset_location=DATASET_LOCATION,
    allow_absolute_paths=True,
)
...  # set product name, datetime and other required properties here
# measurement_title is the band name defined in your ODC product
p.note_measurement(measurement_title, DATASET_LOCATION + TILE_URI)
dataset_doc = p.to_dataset_doc()
serialise.to_doc(dataset_doc)

Notice the usage of allow_absolute_paths, as described in the docs.
Unfortunately, the library doesn't work with multi-band files (see issue). In that case you have to implement your own preparation script to create the EO3 documents, but that doesn't prevent you from referencing remote datasets in the EO3 document. Hope this helps.
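For the multi-band case, such a preparation script boils down to writing the EO3 YAML by hand, using a `band` entry per measurement to select a band inside the same file. Here is a minimal sketch; the product name, URL, grid shape/transform and band names are all illustrative placeholders, not real Copernicus values:

```python
# Hand-rolled EO3 dataset document for a multi-band remote GeoTIFF.
# All names, URLs and grid numbers are placeholders for illustration.
import uuid
import yaml

remote_tif = "https://example.com/copernicus/tile_multiband.tif"  # placeholder URL

doc = {
    "$schema": "https://schemas.opendatacube.org/dataset",
    "id": str(uuid.uuid4()),
    "product": {"name": "copernicus_lc100"},  # must match your registered ODC product
    "crs": "epsg:4326",
    "grids": {
        # shape is (rows, cols); transform is a row-major affine transform
        "default": {
            "shape": [18000, 18000],
            "transform": [0.000992, 0.0, 80.0, 0.0, -0.000992, 60.0, 0.0, 0.0, 1.0],
        }
    },
    "measurements": {
        # 'band' picks the band index inside the multi-band file,
        # so several measurements can point at the same remote path
        "classification": {"path": remote_tif, "band": 1},
        "probability": {"path": remote_tif, "band": 2},
    },
    "properties": {
        "datetime": "2015-01-01T00:00:00Z",
        "odc:processing_datetime": "2015-01-01T00:00:00Z",
    },
    "lineage": {},
}

with open("dataset-doc.yaml", "w") as f:
    yaml.safe_dump(doc, f, sort_keys=False)
```

The resulting file can then be indexed the usual way, e.g. with `datacube dataset add dataset-doc.yaml`, once the matching product definition has been added.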
