Reduce file size of B&W GeoTIFFs

georeferencing geotiff-tiff qgis

I am working with a lot of old black-and-white maps extracted from scanned PDFs. When I use the QGIS Georeferencer, the resulting files are 10x to 100x larger than the originals. Is there a way to prevent this? The starting TIFFs are true black and white with 1 bit per pixel (CCITT Group 4 fax encoding), but the resulting files are greyscale with 16 bits. Is there a way to get QGIS to create and work with true B&W TIFFs?

Best Answer

The creation options for the compression that you want are documented in https://gdal.org/drivers/raster/gtiff.html and they are:

-co COMPRESS=CCITTFAX4 -co NBITS=1 

According to the documentation, "The apparent pixel type should be Byte", and that can be forced with the gdalwarp parameter -ot byte https://gdal.org/programs/gdalwarp.html#cmdoption-gdalwarp-ot.

The problem with the QGIS Georeferencer is that it exposes only a small subset of the GDAL options in the GUI. For example, the user can select the compression method only from a short, fixed list.

What you can do is generate the GDAL commands with the QGIS Georeferencer and then edit them manually.

An example: QGIS creates commands like this:

gdal_translate -of GTiff -gcp 1046.753 2423 49483.801 661482.876 -gcp 4941.956 5306.063 49819.728 661241.55 -gcp 1905.538 5336.734 49387.426 661167.8 "input.tif" "output.tif"

gdalwarp -r near -order 1 -co COMPRESS=DEFLATE  -t_srs EPSG:2053 "output.tif" "input_modified.tif"

Edited commands:

gdal_translate -of VRT -gcp 1046.753 2423 49483.801 661482.876 -gcp 4941.956 5306.063 49819.728 661241.55 -gcp 1905.538 5336.734 49387.426 661167.8 "input.tif" "output.vrt"

gdalwarp -r near -order 1 -ot byte -co COMPRESS=CCITTFAX4 -co NBITS=1  -t_srs EPSG:2053 "output.vrt" "input_modified.tif"

I also changed the output format of gdal_translate to a virtual raster (VRT) https://gdal.org/drivers/raster/vrt.html. That is not necessary, but there is no benefit in writing a physical temporary TIFF file.
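
If there are many maps to process, the manual edit of the gdalwarp line could be scripted. The following is only a sketch: the function name to_bilevel_warp is made up for illustration, it assumes that the QGIS-generated command contains exactly one -co COMPRESS=... option, and it only rewrites the command text without running GDAL. If the generated command has no COMPRESS option, the text is returned unchanged and the options have to be added by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of QGIS or GDAL): rewrite a QGIS-generated
# gdalwarp command so that it requests a 1-bit CCITT Group 4 TIFF.
# It only edits the command text; it does not run GDAL.
to_bilevel_warp() {
    printf '%s\n' "$1" |
        sed -E 's/-co COMPRESS=[A-Za-z0-9]+/-ot byte -co COMPRESS=CCITTFAX4 -co NBITS=1/'
}

# The gdalwarp command as generated by the QGIS Georeferencer in the example.
qgis_cmd='gdalwarp -r near -order 1 -co COMPRESS=DEFLATE -t_srs EPSG:2053 "output.tif" "input_modified.tif"'

edited_cmd=$(to_bilevel_warp "$qgis_cmd")
printf '%s\n' "$edited_cmd"
```

The intermediate file name is left untouched here, so switching it to a VRT as in the edited commands above would still be a separate manual change.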

I ran a test with some real data, and the warped output is a 1-bit binary image. gdalinfo reports such images like this:

Image Structure Metadata:
  COMPRESSION=CCITTFAX4
  INTERLEAVE=BAND
...
Band 1 Block=20186x3 Type=Byte, ColorInterp=Palette
  Image Structure Metadata:
    NBITS=1
  Color Table (RGB with 2 entries)
    0: 255,255,255,255
    1: 0,0,0,255
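
With many output files, the same check could be automated by searching the gdalinfo report for the two metadata items shown above. This is a minimal sketch: the function name is_bilevel is an assumption of mine, and the sample text is a shortened copy of the report above so that the snippet runs even without GDAL installed.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the gdalinfo text on stdin reports
# CCITT Group 4 compression and a band with NBITS=1.
is_bilevel() {
    report=$(cat)
    printf '%s\n' "$report" | grep -q 'COMPRESSION=CCITTFAX4' &&
        # Match NBITS=1 exactly, not NBITS=16 or NBITS=12.
        printf '%s\n' "$report" | grep -Eq 'NBITS=1([^0-9]|$)'
}

# Shortened sample of the gdalinfo report shown above.
sample='Image Structure Metadata:
  COMPRESSION=CCITTFAX4
  INTERLEAVE=BAND
Band 1 Block=20186x3 Type=Byte, ColorInterp=Palette
  Image Structure Metadata:
    NBITS=1'

if printf '%s\n' "$sample" | is_bilevel; then
    result="1-bit CCITTFAX4"
else
    result="not bilevel"
fi
echo "$result"
```

With GDAL installed, the real report can be piped in instead of the sample, for example: gdalinfo input_modified.tif | is_bilevel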