[GIS] How to speed up raster2pgsql

dempostgisrasterraster2pgsql

I'm trying to upload a 37 GB raster dataset to PostGIS. The raster is out-of-db (-R option), so the actual pixel values are not stored in PostGIS. I'm using this command:

raster2pgsql -s 25833 -C -I -R -t auto /data/norway/terrain/10m/*.dem dem.10m | psql -h localhost -U postgres -p 5432 -d norway

Everything works fine, but it's veeery slow: There are 254 raster files in my folder, and in 24 hours only 8 are added. With this speed it will take me a month to upload the dataset!

Is there any way I could speed up the process?

Best Answer

I came across this question while looking up ways to speed up ingestion of raster tiles as well. I wrote a script using Gnu Parallel to multi-thread the process, which sped up the ingestion of about 160gb of tif raster tiles from days to about 6 hours.

#!/bin/sh

listOfTiles() {
    for i in $(find . -type f -iname '*.tif'); do
        if ! grep -qi ${i##*/} copied; then
            rasters+=("$i")
            echo "adding $i"
        else
            echo "skipping $i"
        fi;
    done
}

uploadRasters() {
    parallel --jobs 200% --bar 'raster2pgsql -Y -b 1 -s 4326 -a -F -t 40x40 {} dem | psql && echo "{/} $(date)" >> copied' ::: $1
}

Even on my low-power xeon hardware, the performance increase in notable. Note also that no constraints or indexes are created during ingestion and that the tile dimensions are manually specified. Copy is used instead of insert, which seemed to prevent a deadlock.

Related Question