[GIS] Can GDAL extract a raster window from disk without loading the whole file into memory?

gdal gdal-translate geotiff-tiff

I have a 500MB GeoTIFF with 3 bands on disk. I can easily read a very small, 25×25-pixel subwindow using GDAL like so:

gdal_translate -srcwin 600 550 25 25 hugeimage.tif smallwindow.tif

Does this operation have to read the entire large GeoTIFF into memory, or is it able to seek to the appropriate bytes on disk and read only them?

Does the answer change for selecting individual bands, instead of clipping pixels?

Does the answer change if the GeoTIFF is compressed?

Best Answer

No, it doesn't have to read the whole file. GDAL is geared toward sensible access either line by line or, for tiled files, block by block. Selecting bands makes no difference, and the answer doesn't change if the file is compressed, although the details depend on which compression and which internal tiling are used: GDAL still has to read and decompress every block (strip or tile) that overlaps the window. See "Reading Raster Data" in the GDAL tutorial:

http://www.gdal.org/gdal_tutorial.html

As it says,

There are a few ways to read raster data, but the most common is via the GDALRasterBand::RasterIO() method. This method will automatically take care of data type conversion, up/down sampling and windowing.

The definition of GDALRasterBand::RasterIO() shows the crux of it: this key method takes the offset and size of the window to read, plus the size of the output buffer. The way the file is structured internally also matters. If it is tiled, for example, the cost depends on whether your window straddles multiple tiles or falls within a single one.

http://www.gdal.org/classGDALRasterBand.html#a30786c81246455321e96d73047b8edf1
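
To make the tiling point concrete, here is a minimal Python sketch of the arithmetic involved. It is not GDAL's own code. The function name blocks_touched is made up for illustration, and the block sizes used in the example below are assumed values; for a real file the block size would come from the band itself (GetBlockSize() in the Python bindings).

```python
def blocks_touched(xoff, yoff, xsize, ysize, block_w, block_h):
    """Return the (column, row) indices of the internal blocks that a
    pixel window overlaps. Hypothetical helper, for illustration only."""
    first_col = xoff // block_w
    last_col = (xoff + xsize - 1) // block_w
    first_row = yoff // block_h
    last_row = (yoff + ysize - 1) // block_h
    # Every block in this rectangle of block indices overlaps the window.
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

With assumed 256x256 tiles, blocks_touched(600, 550, 25, 25, 256, 256) gives a single tile, (2, 2), while a window starting at (250, 250) straddles four tiles. If the same file were instead stored as strips one row tall and, say, 10000 pixels wide, the 25x25 window would touch 25 strips, each spanning the full image width.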

I think you'll find GDAL does this with very minimal computation and I/O.
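
For the band question, a rough Python equivalent of the windowed read might look like the sketch below. The function names read_band_window and read_window_from_file are invented for this example; GetRasterBand() and ReadAsArray() are the calls from GDAL's Python bindings, and ReadAsArray() is a thin wrapper over RasterIO(). The import sits inside the second function only so that the first one can be exercised without GDAL installed.

```python
def read_band_window(dataset, band_index, xoff, yoff, xsize, ysize):
    """Read a pixel window from one band of an already opened dataset.

    Hypothetical helper: 'dataset' is expected to behave like an
    osgeo.gdal.Dataset. GDAL band numbers start at 1, not 0.
    """
    band = dataset.GetRasterBand(band_index)
    # Only the requested window is asked for, not the whole band.
    return band.ReadAsArray(xoff, yoff, xsize, ysize)


def read_window_from_file(path, band_index, xoff, yoff, xsize, ysize):
    """Open a raster file and read a window from a single band."""
    from osgeo import gdal  # requires the GDAL Python bindings

    dataset = gdal.Open(path)
    if dataset is None:
        # gdal.Open returns None on failure when exceptions are disabled.
        raise IOError("could not open %s" % path)
    return read_band_window(dataset, band_index, xoff, yoff, xsize, ysize)
```

Calling read_window_from_file("hugeimage.tif", 1, 600, 550, 25, 25) would then request the same 25x25 window as the gdal_translate command in the question, but from band 1 only.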

Related Solutions

[GIS] Reading subset of GeoTIFF file without having to read whole file first

I was making things more complicated than they needed to be. There is a simple solution.

When using ReadAsArray(), the function works as follows:

data = myfile.ReadAsArray(x_offset, y_offset, x_size, y_size)

So if you use:

data = myfile.ReadAsArray(x_index, y_index, 1, 1)

This will read in JUST one pixel.

The same call can be adapted to read a larger window, such as a 3x3 window centred on that pixel:

data = myfile.ReadAsArray(x_index - 1, y_index - 1, 3, 3)
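
One thing to watch with the 3x3 version: for a pixel on the edge of the image, x_index - 1 or y_index - 1 becomes negative and the window extends outside the raster. A small sketch of one way to handle that is below. The name clamp_window and its parameters are made up for this example, and it assumes the centre pixel itself lies inside the raster.

```python
def clamp_window(x_index, y_index, radius, raster_xsize, raster_ysize):
    """Return (x_offset, y_offset, x_size, y_size) for a square window of
    the given radius around a pixel, trimmed to the raster bounds.

    Hypothetical helper; assumes (x_index, y_index) is inside the raster.
    """
    x0 = max(x_index - radius, 0)
    y0 = max(y_index - radius, 0)
    # End coordinates are exclusive, hence the + 1.
    x1 = min(x_index + radius + 1, raster_xsize)
    y1 = min(y_index + radius + 1, raster_ysize)
    return x0, y0, x1 - x0, y1 - y0
```

The result can be passed straight to the read, for example data = myfile.ReadAsArray(*clamp_window(x_index, y_index, 1, myfile.RasterXSize, myfile.RasterYSize)) if myfile is a dataset. Note that the array returned near an edge is smaller than 3x3.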
