ArcGIS – Resolving UnpickleableError in arcpy with Multiprocessing for Efficient Parallel Processing

I'm attempting to speed up a process that currently runs synchronously by using the Python multiprocessing module.

I'm having trouble sending a feature layer to a function which is called by multiprocessing, as demonstrated in this simple script:

import multiprocessing, arcpy

def doProcess(lyr):
    print(lyr.name)

if __name__ == '__main__':

    #Create an array of feature layers
    arcpy.env.workspace = r"C:\Program Files (x86)\ArcGIS\Desktop10.2\TemplateData\TemplateData.gdb"
    featureLayers = []
    fcs = arcpy.ListFeatureClasses("*","All","World")
    for fc in fcs:
        lyrName = fc + "_lyr"
        arcpy.Delete_management(lyrName)
        arcpy.MakeFeatureLayer_management(fc, lyrName)
        featureLayers.append(arcpy.mapping.Layer(lyrName))

    #This works when not using multiprocessing:
    for featureLayer in featureLayers:
        doProcess(featureLayer)

    #This fails with "UnpickleableError: Cannot pickle <type 'geoprocessing Layer object'> objects"
    pool = multiprocessing.Pool()
    pool.map(doProcess, featureLayers)
    pool.close()
    pool.join()

When iterating over the array manually, rather than using multiprocessing, the function has access to the feature layer. But when using multiprocessing, this error message is shown:

UnpickleableError: Cannot pickle <type 'geoprocessing Layer object'> objects

What is the correct syntax/approach to handle a feature layer within the multiprocessing environment? I based the above script on the example in the Esri blog post "Multiprocessing with ArcGIS".

Best Answer

I finally found the time to look into this. I don't fully understand the "unpickleable" error message, but a workaround is to pass only strings (the feature class names) into the multiprocessing pool and create the layer inside the worker function. Something like this:

import multiprocessing, arcpy, os

def doProcess(fClass):
    #This function doesn't do anything, it's just to show that accessing arcpy methods is possible
    print("in do process function for " + fClass)
    arcpy.env.workspace = r"C:\Program Files (x86)\ArcGIS\Desktop10.2\TemplateData\TemplateData.gdb"
    arcpy.Delete_management(fClass + "_lyr")
    lyrName = fClass + "_lyr"
    arcpy.MakeFeatureLayer_management(fClass, lyrName)
    desc = arcpy.Describe(lyrName)
    print("Finished " + desc.Name)

if __name__ == '__main__':

    #Create an array of feature class names
    arcpy.env.workspace = r"C:\Program Files (x86)\ArcGIS\Desktop10.2\TemplateData\TemplateData.gdb"
    fClasses = []
    fcs = arcpy.ListFeatureClasses("*","All","World")
    for fc in fcs:
        fClasses.append(fc)

    #Multiprocessing approach
    pool = multiprocessing.Pool()
    pool.map(doProcess, fClasses)
    pool.close()
    pool.join()
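For context on the error: multiprocessing sends each argument to a worker process by pickling it, so anything handed to pool.map has to be picklable. The layer objects apparently are not, while plain strings are. Here is a minimal standard-library sketch of that difference. FakeLayer is a hypothetical stand-in that refuses pickling (it is not an arcpy class), and "Cities_lyr" is just an illustrative name:

```python
import pickle


class FakeLayer(object):
    """Hypothetical stand-in for a layer object; NOT a real arcpy class."""

    def __init__(self, name):
        self.name = name

    def __reduce__(self):
        # Mimic an object that refuses to be serialized.
        raise pickle.PicklingError("Cannot pickle FakeLayer objects")


def is_picklable(obj):
    """Return True if pickle.dumps accepts obj, as Pool.map requires of its arguments."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


print(is_picklable("Cities_lyr"))             # True: a plain string pickles fine
print(is_picklable(FakeLayer("Cities_lyr")))  # False: the stand-in object does not
```

A check like this can be run on candidate arguments before they go into the pool. Anything that fails it would need to be replaced by a picklable description (a name or a path) and rebuilt inside the worker.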

(Interestingly, the arcpy script above takes a lot longer to complete when I use the multiprocessing approach than when I just run:

for fClass in fClasses:
    doProcess(fClass)

Presumably there's a lot of overhead in starting each worker process and setting up its environment. Hopefully, in a more complicated scenario involving long geoprocessing tasks, the payoff would be faster overall completion of all tasks.)
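One way to see that overhead in isolation is to time both approaches on a task that does almost no work. The following is only a sketch under assumptions: do_process is a trivial placeholder rather than a geoprocessing call, the names are made up, and the pool size of 2 is arbitrary. Actual timings depend on the machine, so none are claimed here.

```python
import multiprocessing
import time


def do_process(name):
    # Placeholder for real work; a real script would run geoprocessing tools here.
    return name + "_lyr"


def run_serial(names):
    # Plain loop in the current process.
    return [do_process(n) for n in names]


def run_parallel(names, processes=2):
    # Same work spread over a pool; only strings cross the process boundary.
    pool = multiprocessing.Pool(processes)
    try:
        return pool.map(do_process, names)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    names = ["fc%d" % i for i in range(20)]  # hypothetical feature class names

    start = time.time()
    serial = run_serial(names)
    print("serial:   %.3f s" % (time.time() - start))

    start = time.time()
    parallel = run_parallel(names)
    print("parallel: %.3f s" % (time.time() - start))

    # pool.map preserves input order, so the two result lists should match.
    print(serial == parallel)
```

With a task this small, the cost of starting the worker processes is likely to dominate. Swapping in a longer-running task is the way to check whether the pool pays off for a given workload.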
