[GIS] Tiling from complex, multi-step GDAL VRT

gdalpythontilingvrt

I've got 1002 ~100 MB (5000'x5000') .tif files in a state plane (transverse merc) projection comprising a county-wide, 1-ft aerial dataset. After two days, gdal_merge.py is only at 93 of 1002. And when it's done, I'll still need to apply a mask/cutline, then reproject to EPSG:3857 before I can even start tiling. So I'm trying to find a quicker approach..

I'm relatively familiar with the VRT format and this idea of lazy evaluation, but after several rounds of testing, I'm not convinced the workflow I'm pursing is more efficient.

Here's the workflow I'm asking about (skipping the mask/cutline for now):

  • gdalbuildvrt –> merge the 1002 .tif files ————-> outputs mosaic VRT
  • gdalwarp —–> transform to EPSG:3857 ————> outputs web merc VRT
  • ..start tiling against the 2nd-generation VRT..

The tiling script I wrote in Python (which may be a weak-link in its own respects) is definitely capable of reading the VRT. Small-area windows render in a few seconds, still longer than I'd like. But large area windows take quite a long time. As I'm writing on this, I'm waiting on a sample z=12 tile to render.. I bet it's already eaten 20+ minutes.

Ultimately, my goal is to create 256×256 tiles in the OSM scheme, like you'd get out of Mapnik, but using a Python/GDAL-only approach. ..I've deemed this necessary because I can't get Mapnik to properly consume my 2nd-generation VRT.

My python script is creating the OSM grid properly—I sniped some code from a Mapnik utility (generate_tiles.;y), and I'm getting proper cell coordinates for my raster. But I question whether or not my ReadRaster() / WriteRaster() implementation is efficient.. I'll paste in that part; values are hard-coded for my sample z=12 tile..

def getTileData():
    gdal.AllRegister()
    dataset = gdal.Open('C:/xGIS/Aerial/3857/2009_1ft_od_3857.vrt', GA_ReadOnly)
    bands = dataset.RasterCount
    band_type = dataset.GetRasterBand(1).DataType

    memDriver = gdal.GetDriverByName('MEM')
    memDS = memDriver.Create("TEMP", 256, 256, bands, band_type, [])
    memDS.SetGeoTransform( [94490, (256.0/(121069-94490)), 0, 96198, 0, 256.0/(96198-94490)] )

    for b in range(1, bands+1):
        s_band = dataset.GetRasterBand( b )
        t_band = memDS.GetRasterBand( b )
        data = s_band.ReadRaster( 94490, 69618, 121069, 96198, 256, 256, band_type )
        t_band.WriteRaster( 0, 0, 256, 256, data )

    createDriver = gdal.GetDriverByName('PNG')
    createDS = createDriver.CreateCopy("C:/xGIS/RichlandCounty/Aerial/SCRich09_1ft/3857/tiles/snap_2.png", memDS)
    createDS = None
    memDS = None

Does anyone have any insights or reactions that might help me get results quicker?

Best Answer

Not sure if it helps you, but this was my workflow to tile 2GB of 300 Dutch Topo maps to OSM compatible zoomlevel 15 and 16:

  • Create vrts for each tif and expand indexed colours to RGBA:

for %%N in (D:\Karten\gdal\gdal2tiles\NL25*.tif) DO gdal_translate -of vrt -expand rgba %%N D:\Karten\gdal\gdal2tiles\NL25\%%~nN.vrt

  • Create an index vrt for all files:

gdalbuildvrt -allow_projection_difference index25.vrt NL25*.vrt

  • Use gdal2tiles to define source CRS, create zoomlevel 16, and mosaic that to zoomlevel 15:

gdal2tiles --s_srs EPSG:28992 --zoom 15-16 index25.vrt tiles

I had to make some minor changes to gdal2tiles to enable correct tile numbering, as described here: GDAL2Tiles: MapTiles from BSB/KAP are Switched

It took several days, but python just is not that quick...

Related Question