I am using the ArcGIS 'Watershed' routine in a script that makes use of multiprocessing.
This works fine, but during execution I get the message:
Unable to remove directory. Possible causes:
1- Not owner of the directory
2- Another person or application is accessing this directory
EDIT 27/02/14:
I've incorporated the suggestions of using arcpy.Exists
(with a very rudimentary way of checking whether something didn't exist) and of writing all results to disk:
    import os
    import tempfile

    import arcpy
    from arcpy import sa

    arcpy.CheckOutExtension("Spatial")

    def watershed(pnts, flowdir, flowacc):
        direc = tempfile.mkdtemp(dir="C:\\temp")  # Create a separate directory for this process's files
        arcpy.env.scratchWorkspace = direc
        arcpy.env.workspace = direc
        res = []
        for i, p in enumerate(pnts):
            pnt = arcpy.PointGeometry(arcpy.Point(p.x, p.y, ID=i))  # Convert the Shapely point to an arcpy point
            pourpt = sa.SnapPourPoint(pnt, flowacc, 10000)  # Snap point to a high flow accumulation cell
            if arcpy.Exists(pourpt):
                ws = sa.Watershed(flowdir, pourpt)  # Calculate watershed
                if arcpy.Exists(ws):
                    out = os.path.join(direc, "poly_%i" % i)
                    poly = arcpy.RasterToPolygon_conversion(ws, out)  # Convert to polygon
                    res.append(poly[0])  # Put the polygon path in the results list for this set of points
                else:
                    res.append("NoWS")
            else:
                res.append("NoPourPt")
        return res
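Since the "Unable to remove directory" message suggests Arc's own cleanup is colliding with something, one option may be to remove each process's scratch directory yourself once the results have been copied or returned, instead of relying on Arc. A minimal standard-library sketch (the `run_in_scratch` helper is my own name, not an arcpy function):

```python
import shutil
import tempfile


def run_in_scratch(work):
    """Create a private scratch directory, run work(direc), then remove it.

    `work` is any callable that writes its intermediate files into `direc`
    and returns results that live elsewhere (in-memory objects, or files
    copied out before cleanup).
    """
    direc = tempfile.mkdtemp(prefix="ws_")
    try:
        return work(direc)
    finally:
        # ignore_errors avoids a crash if Arc already removed part of it
        shutil.rmtree(direc, ignore_errors=True)
```

Each worker process would then do its arcpy work inside `work`, so the directory is gone by the time anything else could try to delete it.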
The parallel function remains the same:
    from multiprocessing import Pool

    def watershed_pll(mod, proc=6):
        """
        Calculate the watershed for each station point using parallel processing.
        All relevant data is held in the mod object.
        """
        pool = Pool(processes=proc)
        # Iterate over each feature of the drainage path and submit its list
        # of pour points to the worker function
        jobs = []
        for key, val in mod.stations.geometry.iteritems():
            jobs.append((key, pool.apply_async(watershed,
                                               (val.points, mod.flowdir, mod.flowacc))))
        pool.close()
        pool.join()
        return jobs
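One thing worth knowing when debugging this: apply_async swallows any exception raised in the worker until .get() is called on the result, so calling .get() on each job surfaces the real traceback instead of a seemingly random downstream error. A self-contained sketch of the pattern (`square` is just a stand-in for the watershed worker):

```python
from multiprocessing import Pool


def square(x):
    # Stand-in for the real worker; raises for bad input like a worker might
    if x < 0:
        raise ValueError("negative input")
    return x * x


def run_jobs(values, proc=2):
    pool = Pool(processes=proc)
    jobs = [(v, pool.apply_async(square, (v,))) for v in values]
    pool.close()
    pool.join()
    results = {}
    for key, job in jobs:
        try:
            results[key] = job.get()  # re-raises any worker exception here
        except ValueError as err:
            results[key] = err  # keep the failure alongside the successes
    return results
```

The same .get() loop applied to the jobs returned by watershed_pll would show which process actually failed and why.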
Sometimes it runs perfectly; other times I get one of various errors, such as the initial one above, or also:
FATAL ERROR (INFADI)
MISSING DIRECTORY
and:
ERROR 010088: Invalid input geodataset (Layer, Tin, etc.).
and:
ERROR 010050: Cell size is not set.
That is odd, as I explicitly set the cell size within the function.
Whether it works or not seems hit-and-miss/random. Such errors lead me to believe that arcpy is still trying to delete things behind the scenes. Is there any way to explicitly prevent it from doing so?
I wonder if my problem could be related to one of two things:
- Environments/folders getting mixed up. The input data is passed as complete filepaths to some raster datasets which are outside of the scratchWorkspace (which is set locally for each process) where intermediate/output data is created. However, I have noticed that Arc may create folders (typically an 'info' folder) in the directories of the input data. Why is that? Can it be prevented, or can I prevent Arc from then trying to delete it?
- Are there any potential problems with accessing the input raster datasets at the same time? I.e. each process will be attempting to open and read from the flow direction and flow accumulation rasters which are passed into the function.
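On the second point, purely read-only access to the same file from several processes is generally safe at the operating-system level, provided each process opens its own handle rather than sharing one across the fork (whether arcpy's raster readers honour that is a separate question). A stdlib-only sketch of the safe pattern, with a plain text file standing in for the raster:

```python
from multiprocessing import Pool


def read_sum(path):
    # Each worker opens its own handle; concurrent read-only access
    # to the same file does not conflict
    with open(path) as f:
        return sum(int(line) for line in f)


def parallel_reads(path, n=4):
    pool = Pool(processes=2)
    try:
        # Every task receives the *path* and opens the file itself,
        # rather than receiving an already-open file object
        return pool.map(read_sum, [path] * n)
    finally:
        pool.close()
        pool.join()
```

The key design choice is passing filepaths to the workers (as the code above already does) instead of open datasets, so no handle is shared between processes.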
Best Answer
Some general steps for debugging this kind of problem...