[GIS] Importing ArcPy taking many seconds and code is really slow for GeoProcessing Tools 64-bit (ArcGIS 10.1 SP1)

64bit, arcgis-10.1, arcpy, geoprocessing, python

I've installed ArcGIS 10.1 SP1, and whenever I import arcpy (in the PythonWin console or Eclipse PyDev) it takes approximately 15-20 seconds. Is this normal? I don't think so…

I had Python 2.7.3 64-bit previously installed on my Windows 7 machine (and still use it), and now I also have the 32-bit and 64-bit Python installations that came with ArcGIS.
I think I might have the paths confused, because the 64-bit version of Python 2.7 in the folder C:\Python27\ArcGISx6410.1 was supposed to be 2.7.2, but when I check the version inside that path, it gives me:

C:\Python27\ArcGISx6410.1>python.exe
Python 2.7.3 (default, Apr 10 2012, 23:24:47) [MSC v.1500 64 bit (AMD64)] on win32

My %PATH% has this, among other paths:

...
C:\Python27
C:\Python27\Scripts
C:\Program Files (x86)\ArcGIS\Desktop10.1\Bin
C:\Program Files (x86)\ArcGIS\EsriProductionMapping\Desktop10.1\Bin
...

And I didn't need to set PYTHONPATH. My ArcMap Python shell works fine, and so does the 64-bit version for background geoprocessing.

My code is very simple: basically I have to find how many times a node (NO) from a node feature class appears in another line feature class. It works, but for 6290 nodes it takes more than 10 minutes!


# Import system modules
import arcpy, sys
from arcpy import env

print "Running against: {}".format(sys.version)

def do_search_by_cursor(table, searchString):
    count = 0
    count += cursor_search_count(table, searchString)
    # print "done by cursor: " + str(count)
    return count

def cursor_search_count(table, searchstring):
    # Open a fresh cursor and count the rows matching the where clause
    rows = arcpy.SearchCursor(table, searchstring)
    count = 0
    for row in rows:
        count += 1
    return count

# Set the workspace environment to local file geodatabase
env.workspace = "S:/workspace/MSc_git/ArcGis/estagio_db.gdb"

# Set local variables
fc_nodes = "Rede_Cbr_MM_node_proj"
fc_links = "Rede_Cbr_MM_2008_WithoutMetro"
# fieldName1 = "LEGS"
# fieldAlias = "numLegs"

fieldname="TONODENO"
# Create field name with the proper delimiters
delimitedfield = arcpy.AddFieldDelimiters(fc_links, fieldname)

# Create search cursor:
rows = arcpy.SearchCursor(fc_nodes, "", "", fields="NO")

for row in rows:
    searchStr = delimitedfield + " = " + str(row.NO)
    myvar = do_search_by_cursor(fc_links, searchStr)
    print "Node %d has %d legs" % (row.NO, myvar)

My PyDev is configured to use the specific Python interpreter that came with the ArcGIS Geoprocessing Tools 64-bit installation, and it looks like this (I've checked all…):
[Screenshot: PyDev configuration for the interpreter]

If you could give me a hint or a link (I've already explored, but nothing seems to work), I'd appreciate it.

Best Answer

I also create ArcPy geoprocessing tools using PyDev for Eclipse. Generally, I run my scripts as in-process geoprocessing tools in a custom toolbox (.tbx) from within ArcMap so that the import arcpy happens instantly. If I run a script from outside of ArcMap, the import arcpy step usually takes about 5 seconds on a Sandy Bridge Core i5.

I think your trouble may lie in the use of 64-bit Python and the ArcGIS 64-bit background geoprocessing package. I don't know why this would take so much longer to import compared to 32-bit ArcPy. If you want confirmation, you could iteratively add/remove Python libraries from the Python Interpreters menu you have and see if a specific folder is causing the excessive import times.
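If you want to measure the import cost by itself, a minimal sketch (standard library only, Python 2 syntax to match the interpreters above) that you can run once in each installed interpreter to compare:

import time

# Time only the arcpy import, from a cold console session.
start = time.time()
import arcpy
print "import arcpy took %.1f seconds" % (time.time() - start)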

Personally, I haven't found a compelling need to use 64-bit geoprocessing. Even in some of the massive Spatial Analyst workflows I've had to implement, geoprocessing usually crashes before I hit memory limits. I prefer to divide my inputs before processing and merge them together at the end. This also ensures better compatibility with third-party modules, most of which are compiled for 32-bit Python.


I also think there are a number of tweaks you can make to your code to make it faster, besides the import arcpy issue and 64- vs. 32-bit Python. Creating a new SearchCursor object for each iteration over your fc_nodes features is really slow. You are opening 6290 separate cursors, one per node, and each cursor runs its own query against fc_links; if fc_links has a comparable number of rows, that is on the order of 6290 * 6290 = 39,564,100 row evaluations. Each cursor also carries creation overhead, so it's not surprising that this takes many minutes to run. You want to avoid this nested-cursor structure; one single-pass alternative is sketched below.
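For example, a sketch (not the asker's code) of counting everything in a single pass with one cursor and a plain dictionary; the feature class and field names are the ones from the question, and it assumes the same workspace:

import arcpy
from arcpy import env
from collections import defaultdict

env.workspace = "S:/workspace/MSc_git/ArcGis/estagio_db.gdb"
fc_links = "Rede_Cbr_MM_2008_WithoutMetro"

# One cursor, one pass over the line feature class:
# tally how many times each TONODENO value occurs.
counts = defaultdict(int)
rows = arcpy.SearchCursor(fc_links, "", "", fields="TONODENO")
for row in rows:
    counts[row.TONODENO] += 1
del rows  # release the cursor's hold on the data

# counts[node_no] is now the leg count for node_no; missing keys mean zero.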

It appears that you are trying to find the count of features in fc_links whose TONODENO field equals each possible value of the NO field in fc_nodes. Instead of using these nested cursors, it would be much easier to use the arcpy.Statistics_analysis() tool, with TONODENO as the case field:

fc_links = "Rede_Cbr_MM_2008_WithoutMetro"
fieldname = "TONODENO"
statisticsTable = "OutputStatisticsTable"
arcpy.Statistics_analysis(fc_links, statisticsTable, [[fieldname,"COUNT"]], fieldname)

This is similar to doing a GROUP BY query in SQL, or a Totals query in Access. In the output table, the FREQUENCY column gives the number of features in fc_links that have that value of the TONODENO field. You can join this table back to the fc_nodes table, or read it into a dictionary, as sketched below.
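For the dictionary route, a minimal sketch reusing the names above; it assumes the Statistics_analysis call has already run, and relies on FREQUENCY being the count field that Summary Statistics writes by default:

# Build a {node number: leg count} lookup from the statistics table.
legs_per_node = {}
rows = arcpy.SearchCursor(statisticsTable)
for row in rows:
    legs_per_node[row.TONODENO] = row.FREQUENCY
del rows

# Nodes that never appear in fc_links are simply absent (zero legs).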
