tl;dr: I can't get gdal_translate to use multiple cores. How to fix?
I am using gdalwarp followed by gdal_translate to process a large GeoTIFF: first I crop to a polygon cutline and output a virtual raster, then I translate the .vrt to a .tif. I have followed suggestions from a few different answers on this site. First, I split the process into two steps to enable better compression, following this answer about gdalwarp. Then I attempted to speed up gdal_translate, following this answer about multithread support for gdal_translate. I am running this on a remote server that has GDAL v2.2.2 installed; the OS is Ubuntu 16.04.6 LTS (Xenial Xerus).
This is my code.
gdalwarp -of vrt -crop_to_cutline \
-cutline ${path}/counties_chesapeake_watershed.gpkg ${path}/bigraster.tif ${path}/clippedraster.vrt
gdal_translate -co compress=LZW -co NUM_THREADS=8 --config GDAL_CACHEMAX 512 \
${path}/clippedraster.vrt ${path}/clippedraster.tif
My issue is that I don't believe that gdal_translate is using multiple cores, though I've tried to specify this with NUM_THREADS and also to increase GDAL_CACHEMAX. This is a very large raster (~12GB, several hundred km extent at 1 m resolution), so it is running extremely slowly. Can anyone help me parallelize the compression done by gdal_translate so this will run faster?
Best Answer
You're getting a speedup using NUM_THREADS, but only at the compression stage. gdal_translate cannot use multithreading for anything apart from compression.
The GDAL_CACHEMAX setting (a configuration option, not a command) is probably helping you more than the NUM_THREADS option.
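As a rough sketch of what that could look like: the snippet below keeps the same gdal_translate call but lets the compression use every core (NUM_THREADS=ALL_CPUS) and gives GDAL a larger block cache. The 4096 MB figure is only an assumption about what the server can spare, and the `cache_mb` and `cmd` variable names are made up for illustration. It prints the command for review rather than running it.

```shell
# Assumption: ${path} is the same directory variable used in the question;
# fall back to the current directory if it is not set.
path="${path:-.}"

# Assumption: the server can spare about 4 GB of RAM for GDAL's block cache.
# GDAL_CACHEMAX is given in megabytes here.
cache_mb=4096

# Same translate step as in the question, with all cores used for compression
# and the larger cache. Compression is the only multithreaded part.
cmd="gdal_translate -co COMPRESS=LZW -co NUM_THREADS=ALL_CPUS --config GDAL_CACHEMAX ${cache_mb} ${path}/clippedraster.vrt ${path}/clippedraster.tif"

# Print the command so it can be checked before it is run.
echo "$cmd"
```

Once the printed line looks right, run it as-is. Raising or lowering `cache_mb` and watching the run time is probably the quickest way to see whether the cache is the limiting factor on your machine.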