I am using gdalwarp to convert GeoPDFs of processed maps to GeoTIFFs (for later stitching into a larger GeoTIFF) using the following command:
gdalwarp -t_srs EPSG:28356 -r cubic -cutline "nsw_map_boundaries\20160506_nsw_map_bounds.geojson" -cwhere "name = '9030-4S SPRINGWOOD'" -crop_to_cutline -dstalpha "9030-4S SPRINGWOOD.pdf" "9030-4S SPRINGWOOD.tif"
The GeoPDFs have collars, so the cutline file contains the boundaries of the actual maps. EPSG:28356 is the projection of the map (GDA94 / MGA Zone 56).
Unfortunately, this approach turns a 10MB PDF into a 70MB GeoTIFF! The warping also re-orients the map to align with the UTM grid.
The main reason for the size is that the output GeoTIFFs are in 32-bit format. The original PDF files only have around 30 distinct colours (see below), so it would be more efficient if the GeoTIFFs were in 8-bit paletted colour. I haven't been able to find a flag or setting to do this.
Is there a way of achieving this – either with gdalwarp, or other GDAL tools (or both)?
One constraint is that the GeoTIFFs do need to have transparency – either via alpha, or via a NoData value – for anything outside the cutline. The current gdalwarp command uses alpha (-dstalpha flag), but only because I couldn't easily get a NoData value to work.
Sample PDF file available from the NSW Topo Map Portal: https://portal.spatial.nsw.gov.au/download/NSWTopographicMaps/DTDB_GeoReferenced_Raster_CollarOn_161070/2017/25k/9030-4S+SPRINGWOOD.pdf
Sample cutline file with all map boundaries can be downloaded from https://maps.ozultimate.com/wiki/downloads (direct link)
Best Answer
gdalwarp writes uncompressed GeoTIFFs by default, so you need to specify the output compression type. Using (lossy) JPEG will get you a much smaller output TIFF (~12MB).
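A minimal sketch of the JPEG route, assuming a POSIX shell (the question's command uses a Windows path, so adjust quoting and separators to suit). The helper name warp_jpeg is hypothetical, and the JPEG_QUALITY and TILED values are illustrative choices rather than tuned settings. Note that with -dstalpha the alpha band is JPEG-compressed too, so check the cutline edges on your own data before relying on it.

```shell
# Hypothetical helper: warp one GeoPDF to a JPEG-compressed GeoTIFF.
# Usage: warp_jpeg "<cutline file>" "<map name>" "<input.pdf>" "<output.tif>"
warp_jpeg() {
    cutline=$1
    name=$2
    src=$3
    dst=$4
    # Same warp as in the question, plus GeoTIFF creation options (-co).
    # JPEG_QUALITY=75 and TILED=YES are assumptions, not tuned values.
    gdalwarp -t_srs EPSG:28356 -r cubic \
        -cutline "$cutline" -cwhere "name = '$name'" \
        -crop_to_cutline -dstalpha \
        -co COMPRESS=JPEG -co JPEG_QUALITY=75 -co TILED=YES \
        "$src" "$dst"
}
```

For the sample map this would be called as warp_jpeg "nsw_map_boundaries/20160506_nsw_map_bounds.geojson" "9030-4S SPRINGWOOD" "9030-4S SPRINGWOOD.pdf" "9030-4S SPRINGWOOD.tif".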
Alternatively, a three-stage process will get you a paletted (~9MB) image. There is a manual step: after converting to paletted with rgb2pct.py, you'll need to figure out which value lies outside the cutline so that you can assign it to NoData.
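The three stages could be sketched roughly as follows, again assuming a POSIX shell. The helper names make_paletted and set_nodata, the intermediate file name, and the DEFLATE compression are assumptions for illustration. The warp drops -dstalpha so that rgb2pct.py receives a plain 3-band RGB file, and the sketch assumes the fill outside the cutline ends up with a palette index of its own. How rgb2pct.py is invoked may also differ between GDAL installs.

```shell
# Stages 1 and 2 (hypothetical helper): warp to a 3-band RGB GeoTIFF, then
# quantise it to an 8-bit paletted GeoTIFF with rgb2pct.py.
# Usage: make_paletted "<cutline file>" "<map name>" "<input.pdf>" "<paletted.tif>"
make_paletted() {
    cutline=$1
    name=$2
    src=$3
    pct=$4
    rgb="${pct%.tif}_rgb.tif"   # intermediate file name is an arbitrary choice
    gdalwarp -t_srs EPSG:28356 -r cubic \
        -cutline "$cutline" -cwhere "name = '$name'" \
        -crop_to_cutline "$src" "$rgb" &&
    rgb2pct.py -n 256 -of GTiff "$rgb" "$pct"
}

# Stage 3 (hypothetical helper): tag the palette index found by hand as NoData
# and compress the result. DEFLATE is an assumed lossless choice.
# Usage: set_nodata <palette index> "<paletted.tif>" "<output.tif>"
set_nodata() {
    gdal_translate -a_nodata "$1" -co COMPRESS=DEFLATE "$2" "$3"
}
```

Between the two calls, inspect the paletted file (for example by querying a pixel outside the cutline) to find the index to pass to set_nodata.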