[GIS] Use subprocess module to run GDAL processes

gdal, python, subprocess

I am attempting to write a Python script that will reproject and merge a series of .tif files, then clip them to the boundaries of a .shp file. I have successfully used the subprocess module for step 1, but I am not sure how to use subprocess to run a second operation. Below is an example of what it might look like if I used subprocess/GDAL to reproject one set of files, then reproject a second set in another folder in the same script. Any ideas on why this won't work?

import os, gdal, subprocess, sys

cmd = ['gdalwarp', '-t_srs','+proj=longlat +ellps=WGS84','*.tif','new7.tif']
cmd2 = ['gdalwarp', '-t_srs','+proj=longlat +ellps=WGS84','newfolder/band4.tif','newfolder/band4_r.tif']
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,stderr=subprocess.PIPE)
stdout,stderr=proc.communicate()
proc1 = subprocess1.Popen(cmd2,stdout1=subprocess1.PIPE,stderr1=subprocess1.PIPE)
stdout1,stderr1=proc.communicate()
exit_code=proc.wait()

I am new to Python and subprocess, so my code may be way off. Any ideas?

Best Answer

Unfortunately, your problem description leaves something to be desired. You should state more clearly what does not work, which error messages you receive, and so forth. When something fails, it is usually best to test one step at a time.

In your case I see some problems that may cause your script to "not work". First, you should specify the correct path to your programs and to your data. Working with relative paths is problematic, since your script may not run in the same shell environment (and working directory) as you do. So your relative path newfolder/band4.tif may be the reason why the command does not work. Second, you are fiddling with Python commands you do not yet understand well enough. If you want to continue using subprocess.Popen, it is best to read its documentation and understand it first.
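
To make the point about paths concrete, here is a minimal sketch of resolving both the executable and the data files to absolute paths before building a command. The helper name resolve_paths is hypothetical, the file names are taken from the question, and shutil.which assumes Python 3.3 or later.

```python
import os
import shutil


def resolve_paths(program, *files):
    """Return the absolute path of `program` (None if it is not on PATH)
    and a list with the absolute versions of the given file paths."""
    exe = shutil.which(program)  # None when the program cannot be found
    return exe, [os.path.abspath(f) for f in files]


exe, (src, dst) = resolve_paths('gdalwarp', 'newfolder/band4.tif', 'newfolder/band4_r.tif')
if exe is None:
    print('gdalwarp was not found on PATH; use its full path instead')
else:
    print(exe, src, dst)
```

Note that os.path.abspath resolves against the current working directory of the script, so this only helps if the script is started from the directory you expect.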

Anyway, here are a few tips that should get you running.

  • use the full path to gdalwarp, e.g. /usr/local/bin/gdalwarp
  • do not use subprocess.Popen, but use subprocess.check_call() instead 1)
  • omit all the proc.communicate() calls, you probably don't need them
  • '+proj=longlat +ellps=WGS84' is effectively equivalent to 'EPSG:4326'

1) subprocess.check_call() waits for the command to complete before your script continues. That is useful for seeing where your problem lies, and vital if you want to process the same data sequentially. Once your program runs, you may replace subprocess.check_call() with other subprocess functions.
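
The behaviour described in the footnote can be seen in a small sketch. Here sys.executable (the running Python interpreter) stands in for gdalwarp so the example runs without GDAL installed; the two commands are made up for illustration.

```python
import subprocess
import sys

# sys.executable stands in for gdalwarp so the sketch runs without GDAL.
ok_cmd = [sys.executable, '-c', 'print("step 1 done")']
bad_cmd = [sys.executable, '-c', 'import sys; sys.exit(3)']

# check_call returns 0 only after the child process has finished.
result = subprocess.check_call(ok_cmd)

try:
    subprocess.check_call(bad_cmd)
    failed_code = None
except subprocess.CalledProcessError as err:
    # A non-zero exit status raises an exception instead of passing silently.
    failed_code = err.returncode
    print('command failed with exit code', failed_code)
```

Because a failing command raises an exception, an uncaught failure in the first step stops the script before the second step touches the same data.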

A little example script:

import subprocess

def warp(args):
    """with a def you can easily change your subprocess call"""
    # command construction with binary and options
    options = ['/usr/bin/gdalwarp']
    options.extend(args)
    # call gdalwarp; raises subprocess.CalledProcessError if it fails
    subprocess.check_call(options)

# just pass the list of arguments without the gdalwarp command at the beginning, since it is added automatically
# this should run two commands of gdalwarp sequentially
warp(['-t_srs', 'EPSG:4326', '/home/user/maps/source.tif', '/home/user/maps/target.tif'])
warp(['-t_srs', 'EPSG:3857', '-of', 'AAIGrid', '/home/user/maps/source.tif', '/home/user/maps/target.xyz'])
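
One further detail from the question is worth a sketch: when the command is passed as a list, no shell is involved, so a pattern such as '*.tif' reaches gdalwarp literally instead of being expanded to file names. The following expands the pattern in Python first, assuming gdalwarp's usual "source files, then destination" argument order. The helper name build_warp_command and the paths in the usage comment are hypothetical.

```python
import glob
import os


def build_warp_command(src_dir, dst_file, t_srs='EPSG:4326',
                       gdalwarp='/usr/bin/gdalwarp'):
    """Build a gdalwarp argument list with every .tif in src_dir as a source."""
    sources = sorted(glob.glob(os.path.join(src_dir, '*.tif')))
    if not sources:
        raise ValueError('no .tif files found in ' + src_dir)
    return [gdalwarp, '-t_srs', t_srs] + sources + [dst_file]


# Hypothetical usage; pass the result to subprocess.check_call(...):
# cmd = build_warp_command('/home/user/maps', '/home/user/out/merged.tif')
```

Writing the output to a different folder than the sources avoids picking up an earlier result as an input on the next run.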

I have a little script which you can examine. It creates a combined hillshade from SRTM data. It does not run well at the moment, though for unrelated reasons. There you can see how I have organized my workflow for lots of GDAL commands, and ImageMagick too.