[GIS] Very slow loading of PostGIS raster layer in qGIS

postgispostgresqlqgisrasterraster2pgsql

I'm trying to use big raster dataset with PostGIS 2.5 on PostgreSQL 11 on Linux for the purpose of serving a base map organization-wide. My dataset is 2TB of TIFFs. I created 8GB test table (including pyramids) from TIFFs with original resolution of 10240×10240, tiled at 200×200, 8 levels of pyramids, default constraints and index created by the raster2pgsql tool using command:

raster2pgsql -d -I -C -Y -N 0 -l 2,4,8,16,32,64,128,256 -s 32635 -M -P -t 200x200 *.tif rasters | psql -h -U postgres -d testdb

My issue is with loading speed of the layer in QGIS. Every time I load the layer or open saved QGIS project with the raster layer it request layer statistics or something from PostGIS that produces full table scan. With my 8GB test database it take about 5 minutes to load. With 30GB test data it took several hours to load. I tested with different tiles from 32×32 to 2048×3048 pixels in sizes, but with minor differences in loading time. The data is in UTM.

The query that is executed by QGIS is:

select srid, nbband, st_xmin(geom) as xmin, st_xmax(geom) as xmax, st_ymin(geom) as ymin, st_ymax(geom) as ymax, scale_x, scale_y from (select st_srid("rast") srid, st_extent("rast"::geometry) geom, max(ST_NumBands("rast")) nbband, avg(ST_ScaleX("rast")) scale_x, avg(ST_ScaleY("rast")) scale_y from "rasters" group by st_srid("rast")) foo;

using explain I get:

"Subquery Scan on foo (cost=58862.06..192304.46 rows=162240 width=56)"
" -> GroupAggregate (cost=58862.06..187437.26 rows=162240 width=89)"
" Group Key: (st_srid(rasters.rast))"
" -> Sort (cost=58862.06..59267.66 rows=162240 width=580)"
" Sort Key: (st_srid(rasters.rast))"
" -> Seq Scan on rasters (cost=0.00..14717.00 rows=162240 width=580)"

explain analyze returns:

"Subquery Scan on foo (cost=58862.06..192304.46 rows=162240 width=56) (actual time=208240.590..208240.591 rows=1 loops=1)"
" -> GroupAggregate (cost=58862.06..187437.26 rows=162240 width=89) (actual time=208240.588..208240.588 rows=1 loops=1)"
" Group Key: (st_srid(rasters.rast))"
" -> Sort (cost=58862.06..59267.66 rows=162240 width=580) (actual time=68488.704..68588.816 rows=162240 loops=1)"
" Sort Key: (st_srid(rasters.rast))"
" Sort Method: external merge Disk: 94512kB"
" -> Seq Scan on rasters (cost=0.00..14717.00 rows=162240 width=580) (actual time=19.640..68370.097 rows=162240 loops=1)"
"Planning Time: 0.056 ms"
"Execution Time: 208243.197 ms"

Any idea how to load the raster without this delay?

Best Answer

As User30184 pointed out in the comments I was missing the -r switch.

After adding it I got the regular blocking working for all the pyramid layers, but not for the main layer as it was reported by raster2pgsql to be bigger than the maximum 65535x65535 pixels. With some more digging I found out that there is a ticket about this problem, which is planed to be fixed in PostGIS 3.0.

Using:

SELECT AddRasterConstraints('public', 'rasters'::name, 'rast'::name);

PostGIS created constraints that are appropriate for my data.

QGIS was now loading the raster quicker, but still it took a long time. Checking the raster_columns table I noticed that the "extent" column is NULL for all the tables. Using SQL commands:

ALTER TABLE rasters VALIDATE CONSTRAINT enforce_max_extent_rast;
ALTER TABLE o_2_rasters VALIDATE CONSTRAINT enforce_max_extent_rast;
ALTER TABLE o_4_rasters VALIDATE CONSTRAINT enforce_max_extent_rast;
ALTER TABLE o_8_rasters VALIDATE CONSTRAINT enforce_max_extent_rast;
ALTER TABLE o_16_rasters VALIDATE CONSTRAINT enforce_max_extent_rast;
ALTER TABLE o_32_rasters VALIDATE CONSTRAINT enforce_max_extent_rast;
ALTER TABLE o_64_rasters VALIDATE CONSTRAINT enforce_max_extent_rast;
ALTER TABLE o_128_rasters VALIDATE CONSTRAINT enforce_max_extent_rast;
ALTER TABLE o_256_rasters VALIDATE CONSTRAINT enforce_max_extent_rast;

PostGIS calculated the extents and QGIS loaded my rasters table for 1 second!

Thank you User30184 for the hint about -r and for the link to the documentation.