[GIS] How does QGIS open such large raster datasets (about 40 GB)?

gdal, geotiff-tiff, numpy, qgis

I have a problem with the GDAL library when opening a big GeoTIFF file, about 32000×32000 pixels in size. I cannot use the ReadAsArray function because of the maximum size of a numpy array in Python. But I wonder why QGIS can open that file easily. What is the technique behind it?

Best Answer

If QGIS is running in a 1000x1000 pixel window on your screen, there is no need to read all 32000x32000 pixels to draw the map. GDAL tries to read no data at all outside the requested bounding box, and if the image has overviews, the data come from the resolution level best suited to the map resolution. There is always some overhead, but even if GDAL needs to read 2000x2000 pixels, that is still very little compared to 32000x32000 pixels' worth of data.
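For the Python side of the question, the same idea can be sketched with the GDAL bindings: ask only for the window you need, and let GDAL resample it to the size of your output buffer. This is a minimal sketch, not what QGIS does internally. The names clip_window and read_view are made up for illustration, and the file is assumed to be readable by gdal.Open.

```python
def clip_window(xoff, yoff, xsize, ysize, raster_w, raster_h):
    """Clip a requested pixel window to the raster extent.

    Returns (xoff, yoff, xsize, ysize), or None if the window lies
    completely outside the raster.
    """
    x0 = max(0, xoff)
    y0 = max(0, yoff)
    x1 = min(raster_w, xoff + xsize)
    y1 = min(raster_h, yoff + ysize)
    if x1 <= x0 or y1 <= y0:
        return None
    return x0, y0, x1 - x0, y1 - y0


def read_view(path, xoff, yoff, xsize, ysize, out_w, out_h, band_index=1):
    """Read one window of one band, resampled to roughly out_w x out_h pixels."""
    # Imported here so that clip_window can be used without GDAL installed.
    from osgeo import gdal

    ds = gdal.Open(path)
    if ds is None:
        raise IOError("could not open " + path)

    window = clip_window(xoff, yoff, xsize, ysize,
                         ds.RasterXSize, ds.RasterYSize)
    if window is None:
        return None
    x0, y0, w, h = window

    # Shrink the output buffer in proportion if the window was clipped.
    buf_w = max(1, round(out_w * w / xsize))
    buf_h = max(1, round(out_h * h / ysize))

    band = ds.GetRasterBand(band_index)
    # When the buffer is smaller than the window, GDAL decimates the data
    # and can serve the request from overviews if the file has them.
    return band.ReadAsArray(x0, y0, w, h, buf_xsize=buf_w, buf_ysize=buf_h)
```

Calling read_view("big.tif", 0, 0, 32000, 32000, 1000, 1000) would give a 1000x1000 numpy array covering the whole image, where "big.tif" is a placeholder path. The array held in memory is the size of the screen, not the size of the file.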

How well the "read-only-what-you-need" principle works depends on the image format and the corresponding driver. If you have a GeoTIFF that is internally tiled into 256x256 tiles and contains overviews (also called pyramid layers or reduced-resolution images), GDAL can do it pretty well. On the other hand, large PNG and JPEG images are inefficient because the whole image must be decompressed before data can be taken from a small region of interest.
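To check whether a particular file is tiled and has overviews, you can ask the band for its block size and overview list. The sketch below also includes a simplified rule for choosing an overview for a given downsampling factor. That rule only illustrates the idea and is not GDAL's exact selection logic, and the names pick_overview and describe_raster are made up for this example.

```python
def pick_overview(full_width, overview_widths, factor):
    """Pick the coarsest overview that is not coarser than `factor`.

    Simplified illustration only, not GDAL's own algorithm.
    Returns an index into overview_widths, or -1 for full resolution.
    """
    best = -1
    best_ratio = 1.0
    for i, width in enumerate(overview_widths):
        ratio = full_width / width
        if best_ratio < ratio <= factor:
            best, best_ratio = i, ratio
    return best


def describe_raster(path, band_index=1):
    """Return raster size, block size and overview widths of one band."""
    # Imported here so that pick_overview can be used without GDAL installed.
    from osgeo import gdal

    ds = gdal.Open(path)
    if ds is None:
        raise IOError("could not open " + path)
    band = ds.GetRasterBand(band_index)

    # For a tiled GeoTIFF this is the tile size, e.g. 256 x 256.
    # For a striped file it is typically the full width by a few rows.
    block_w, block_h = band.GetBlockSize()

    overview_widths = [band.GetOverview(i).XSize
                       for i in range(band.GetOverviewCount())]
    return {
        "size": (ds.RasterXSize, ds.RasterYSize),
        "block": (block_w, block_h),
        "overview_widths": overview_widths,
    }
```

An empty overview_widths list means every zoomed-out view has to be computed from full-resolution data, which is where a large file starts to feel slow.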

Note: You may have noticed that even huge GeoTIFF files compressed with the JPEG method are not inefficient at all. That is because in this case the TIFF file is tiled and each tile is compressed with JPEG individually. GDAL does need to decompress each tile it touches completely, but because the tiles are small, only 256x256 pixels, the operation is cheap and memory usage is low.
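The same block structure helps when you really do need to touch every pixel from Python: instead of one huge ReadAsArray call, read the raster one block at a time. This is a rough sketch under the assumption that a running maximum is what you want to compute. The names iter_windows and band_max are hypothetical, and any per-block computation could take the place of the maximum.

```python
def iter_windows(width, height, block_w, block_h):
    """Yield (xoff, yoff, xsize, ysize) windows that cover the raster."""
    for yoff in range(0, height, block_h):
        ysize = min(block_h, height - yoff)
        for xoff in range(0, width, block_w):
            xsize = min(block_w, width - xoff)
            yield xoff, yoff, xsize, ysize


def band_max(path, band_index=1):
    """Compute the maximum pixel value of a band, one block at a time."""
    # Imported here so that iter_windows can be used without GDAL installed.
    from osgeo import gdal

    ds = gdal.Open(path)
    if ds is None:
        raise IOError("could not open " + path)
    band = ds.GetRasterBand(band_index)
    block_w, block_h = band.GetBlockSize()

    result = None
    for xoff, yoff, xsize, ysize in iter_windows(
            ds.RasterXSize, ds.RasterYSize, block_w, block_h):
        # Only one block is held in memory at any time.
        block = band.ReadAsArray(xoff, yoff, xsize, ysize)
        block_max = block.max()
        if result is None or block_max > result:
            result = block_max
    return result
```

Reading along the file's own block boundaries means each tile is decompressed only once, so this stays fast even for a 32000x32000 image.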

Read about blocks, windowing and overviews in the GDAL tutorial: http://www.gdal.org/gdal_tutorial.html
