I have a 500MB GeoTIFF with 3 bands on disk. I can easily read a very small, 25×25-pixel subwindow using GDAL like so:
gdal_translate -srcwin 600 550 25 25 hugeimage.tif smallwindow.tif
Does this operation have to read the entire large GeoTIFF into memory, or is it able to seek to the appropriate bytes on disk and read only them?
Does the answer change for selecting individual bands, instead of clipping pixels?
Does the answer change if the GeoTIFF is compressed?
Best Answer
It doesn't have to read the whole thing. GDAL is heavily geared toward sensible access, either line by line (for striped files) or block by block (for tiled files). Selecting bands doesn't change this, and neither does compression, but the real details will depend on which compression and which internal tiling are used: a compressed strip or tile generally has to be decompressed as a whole, even if you only want a few of its pixels. See "Reading Raster Data" here
http://www.gdal.org/gdal_tutorial.html
As it says,
The definition for GDALRasterBand::RasterIO() shows the crux of it: this key method accepts the offset and size of the window to read, plus the size of the output buffer. The way the file is structured internally will matter, say if it's tiled, since the amount of I/O depends on whether your window straddles multiple tiles or falls within one.
http://www.gdal.org/classGDALRasterBand.html#a30786c81246455321e96d73047b8edf1
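To make the tiling point concrete, here is a rough sketch of the arithmetic involved. It is not GDAL's actual code, just the integer division that decides which blocks a window intersects. The helper name blocks_touched, the 256×256 tile size and the 4000-pixel image width are all assumptions for illustration; your file's real block size may differ.

```python
def blocks_touched(xoff, yoff, xsize, ysize, block_w, block_h):
    """Return the (col, row) indices of every block a pixel window intersects."""
    first_col = xoff // block_w
    last_col = (xoff + xsize - 1) // block_w
    first_row = yoff // block_h
    last_row = (yoff + ysize - 1) // block_h
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# The window from the question: -srcwin 600 550 25 25
window = (600, 550, 25, 25)

# Assumed layout 1: internal tiles of 256x256 pixels.
tiled = blocks_touched(*window, block_w=256, block_h=256)

# Assumed layout 2: one-line strips across a hypothetical 4000-pixel-wide image.
striped = blocks_touched(*window, block_w=4000, block_h=1)

print(len(tiled), len(striped))  # 1 25
```

With these assumed sizes the window sits inside a single tile, while the striped layout needs 25 one-line strips. Shift the window to 500 500 25 25 and it straddles four tiles instead of one.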
I think you'll find GDAL does this with very minimal computation and I/O.
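If you want to check this against your own file, something like the following should work with the GDAL Python bindings (it needs the osgeo package and NumPy installed). The function name read_window is made up for this example; GetBlockSize() and ReadAsArray() are the binding's own methods, and ReadAsArray() is a wrapper around the same windowed read that RasterIO() performs.

```python
def read_window(path, xoff, yoff, xsize, ysize, band_index=1):
    """Read one band's pixel window and report that band's internal block size."""
    # Imported here so the function can be defined even where GDAL is absent.
    from osgeo import gdal

    ds = gdal.Open(path)
    if ds is None:
        raise IOError("could not open " + path)

    band = ds.GetRasterBand(band_index)       # bands are numbered from 1
    block_w, block_h = band.GetBlockSize()    # strip or tile dimensions
    data = band.ReadAsArray(xoff, yoff, xsize, ysize)

    ds = None  # drop the reference so GDAL closes the file
    return data, (block_w, block_h)
```

Calling read_window("hugeimage.tif", 600, 550, 25, 25) would return the same 25×25 window as the gdal_translate command in the question, for band 1 only, along with the block size you can plug into the tile arithmetic.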