[GIS] Clip and copy files to same directory structure using OGR/GDAL with Python

gdalogrosgeopython

We need to clip all our data (many TB) by a new boundary and save it while maintaining the same data structure. I want to use OGR/GDAL as the data is in shp/tab/kml/geoTif/ecw etc.

I can find the file and create the data structure in python but need to use ogr/gdal to then clip the file and save it to the new directory in the same projection and file type as the input. I have written the process to do this using arcpy but that doesn't work for tab and ecw, so I would prefer to go opensource but am new to this…

So the python is as follows —

import os, glob, sys, shutil, datetime, subprocess
top = r'C:\Program Files (x86)\MapInfo\Offline Dataset\MapInfo_Tables\Property Information' # Source directory for all files to check and process
#top = os.getcwd()
RootOutput = r'C:\junk' # Directory to copy/move files to
RootDirDelete='C:\\' #repatative name of root file path to remove from new directory name
clip_polygon='c:\junk\boundary.shp'
filecount=0
successcount=0
errorcount=0

print "Working in: "+ top

list =[]
f = open(RootOutput+'\\Success_LOG.txt', 'a')
f.write("Log of files Succesfully processed. RESULT of process run @:"+str(datetime.datetime.now())+"\n")
f.close()


for root, dirs, files in os.walk(top, topdown=False):
    for directory in dirs:
        currentPath=os.path.join(root,directory)
        os.chdir(currentPath)
        #arcpy.env.workspace = currentPath
        print os.getcwd()
        lstFCs = glob.glob('*.*')
        print lstFCs
        for fc in lstFCs:
            OutPutDir=RootOutput+"\\"+str(os.getcwd()).strip(RootDirDelete)
            filecount=filecount+1
            if OutPutDir.endswith(".gdb"):
                print "File in GDB...SKIPPING"
            else:
                try:
                    os.makedirs(OutPutDir)
                    print "Created: "+OutPutDir
                except:
                    #pass
                    print "Unexpected error:", sys.exc_info()[0]

##                try:
                fcIn=currentPath+"\\"+fc
                fcOut=OutPutDir+"\\"+fc
                ext=fc[fc.find('.')+1:]
                if ext=='shp':
                    subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", clip_polygon, fcOut, fcIn], shell=True)
                    successcount=successcount+1
                    print "Data clipped: "+fcIn+" to " +fcOut
                else:
                    print 'extension not shp'

I need

  • ogr clip function to clip data by clip_polygon for vector, raster and elevation

    -in arcpy it is like

    #arcpy.Clip_analysis(outPath+'\'+outFileN+'.shp', clip_features, outPath+'\'+outFileN+AOI+'.shp', xy_tolerance)

    -in ogr2ogr as a process I guess the following would work (Error when using gdal/ogr and python to clip a "shp" file by a "shp")

    #subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", clipping_shp, output_shp, input_shp], shell=True)

  • projection should be same as input (usually EPSG 28355 or 4283)

UPDATE: I ran the above process and it gives the following errors, the subprocess when run independently reports a 1 but no file is written…

[Dbg]>>> fcIn
'C:\\Program Files (x86)\\MapInfo\\Offline Dataset\\MapInfo_Tables\\Property Information\\shp\\TRC_DCDB.shp'
[Dbg]>>> fcOut
'C:\\junk\\Program Files (x86)\\MapInfo\\Offline Dataset\\MapInfo_Tables\\Property Information\\shp\\TRC_DCDB.shp'
[Dbg]>>> clip_polygon
'c:\\junk\\boundary.shp'
[Dbg]>>> subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", clip_polygon, fcOut, fcIn], shell=True)
1
[Dbg]>>> 
[Dbg]>>> subprocess.check_call(['gdalinfo', fcIn], shell=True)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "C:\Python27\ArcGIS10.1\lib\subprocess.py", line 511, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['gdalinfo', 'C:\\Program Files (x86)\\MapInfo\\Offline Dataset\\MapInfo_Tables\\Property Information\\shp\\TRC_DCDB.shp']' returned non-zero exit status 1
[Dbg]>>> 

UPDATED

        if ext=='shp':
            subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", clip_polygon, fcOut,  fcIn])

gives me the following error.

>>> 
Working in: C:\Program Files (x86)\MapInfo\Offline Dataset\MapInfo_Tables\Property Information
C:\Program Files (x86)\MapInfo\Offline Dataset\MapInfo_Tables\Property Information\shp
['TRC_DCDB.dbf', 'TRC_DCDB.prj', 'TRC_DCDB.sbn', 'TRC_DCDB.sbx', 'TRC_DCDB.shp', 'TRC_DCDB.shx', 'TRC_DCDB_Raw.dbf', 'TRC_DCDB_Raw.prj', 'TRC_DCDB_Raw.sbn', 'TRC_DCDB_Raw.sbx', 'TRC_DCDB_Raw.shp', 'TRC_DCDB_Raw.shx']
Unexpected error: <type 'exceptions.WindowsError'>
extension not shp
Unexpected error: <type 'exceptions.WindowsError'>
extension not shp
Unexpected error: <type 'exceptions.WindowsError'>
extension not shp
Unexpected error: <type 'exceptions.WindowsError'>
extension not shp
Unexpected error: <type 'exceptions.WindowsError'>
Traceback (most recent call last):
  File "C:\junk\Clip_And_Copy_Files.py", line 45, in <module>
    subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", clip_polygon, fcOut,  fcIn])
  File "C:\Python27\ArcGIS10.1\lib\subprocess.py", line 493, in call
    return Popen(*popenargs, **kwargs).wait()
  File "C:\Python27\ArcGIS10.1\lib\subprocess.py", line 679, in __init__
    errread, errwrite)
  File "C:\Python27\ArcGIS10.1\lib\subprocess.py", line 893, in _execute_child
    startupinfo)
WindowsError: [Error 2] The system cannot find the file specified
>>> 

From FWTools ogr2ogr in Python

I tried

>>> command = 'ogr2ogr -f "ESRI Shapefile" ' + fcIn + fcOut
>>> os.system
<built-in function system>
>>> os.system(command)
1
>>> os.getcwd()
'C:\\junk'
>>> fcOut
'C:\\junk\\Program Files (x86)\\MapInfo\\Offline Dataset\\MapInfo_Tables\\Property Information\\shp\\TRC_DCDB.shp'
>>> fcIn
'C:\\Program Files (x86)\\MapInfo\\Offline Dataset\\MapInfo_Tables\\Property Information\\shp\\TRC_DCDB.shp'
>>> 

but again I get no output..

Best Answer

In general, don't use shell=True (especially with lists, see Getting output from Python script in Python tests

import subprocess
for subdir, dirs, files in os.walk(folder):
    for file in files:
       if file.endswith(".shp"):
           fileok=  os.path.abspath(os.path.join(subdir,file))      
           subprocess.call(["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", "clip.shp",  "res2.shp", fileok])

or

for file in files:
    if file.endswith(".shp"):
         fileok=  os.path.abspath(os.path.join(subdir,file))    
         command = ["ogr2ogr", "-f", "ESRI Shapefile", "-clipsrc", "clip.shp",  "res3.shp", fileok]     
         subprocess.Popen(command, stdout=subprocess.PIPE, stdin=subprocess.PIPE,stderr=subprocess.STDOUT) 

works