[GIS] Subsetting GeoTIFF with Python

gdalgeotiff-tiffnumpypython

I'm sub-setting a GeoTIFF with the GDAL Python binding and numpy. Essentially, I am providing a central location (coordinates of a flux tower) from which I produce a subsetted image. I then want to write this to a new GeoTIFF. However, my current code does not correctly georeference. I'm having a hard time digesting the rather sparse and confusing GDAL documentation and so would love for some help. I have read that this might be possible with a gdal_translate or a VRT, but can not figure out any more from the docs.

So, here is how I load the GeoTIFF in question and produce the subset:

datafile = gdal.Open("%s" % filenames_matched[site][filen] )
data = datafile.ReadAsArray()
info = datafile.GetGeoTransform()       
# GDAL provides the top-left coordinate (lat, lon)
#Site_lon and site_lat are the coordinates for the fluxtower site.
# Produce a relative coordinate for the fluxtower (eg relative to gdal ul(lat,lon))
lon_rel = site_lon - info[0]
lat_rel = site_lat - info[3]

#Get pixel size (resolution)
res_we = abs(info[1])
res_ns = abs(info[5])   

#Now convert the relative position from degrees to pixels..
pixel_lon = float ( lon_rel) / res_we
pixel_lat = float ( lat_rel ) / res_ns

# Will, of course, have to round this to the nearest whole pixel...
pixel_lon = round( pixel_lon )
pixel_lat = round( pixel_lat )

# Now need to outline the actual buffer of 24 on each side...
buffer_lon_east = pixel_lon + 24.5
buffer_lon_west = pixel_lon - 24.5
buffer_lat_north = pixel_lat + 24.5
buffer_lat_south = pixel_lat - 24.5

#Produce subset array
subset = data[buffer_lon_west:buffer_lon_east,buffer_lat_south:buffer_lat_north]

And then save the new subsetted GeoTIFF:

Nx, Ny = subset.shape
driver = gdal.GetDriverByName("GTiff")

projection = datafile.GetProjection()    # Same projection as input (WGS84)
ds = driver.Create("%s" % composited_filename, Nx, Ny, 1, gdal.GDT_Float32)
ds.SetProjection(projection)

# Where I imagine the problem is:
# Sets new coordinates from derived ul lat, lon for the subset 
ds.SetGeoTransform([info[0] + lon_rel, info[1], info[2], info[3] + lat_rel, info[4], info[5])

ds.GetRasterBand(1).WriteArray(subset)
ds = None

Best Answer

You can indeed use gdal_translate either giving the source coordinates ([-srcwin xoff yoff xsize ysize]) or the georeferenced coordinates ([-projwin ulx uly lrx lry]). For instance:

import os
inDS = ... # input raster
outDS = ... # output raster
lon = ... # lon of your flux tower
lat = ... # lat of your flux tower
ulx = lon - 24.5
uly = lat + 24.5
lrx = lon + 24.5
lry = lat - 24.5
translate = 'gdal_translate -projwin %s %s %s %s %s %s' %(ulx, uly, lrx, lry, inDS, outDS)
os.system(translate)
Related Question