This is more of a few tips than an answer. You mention that it takes a while to run. I have to things to point out that can save some you some time.
1) Your code goes through the wells once; no need to do it twice. You can get the list of well owner's names from the well_owners.values(). That's a python list of all the values in the dictionary. In your case you would get a list of string values. Thus we can chop:
# Create search cursor which will iterate through wells
with arcpy.da.SearchCursor(wellsFL, ["OBJECTID"]) as cursor:
for row in cursor:
# set well owner name for this record
record = row[0]
owner = well_owners[record]
matches[record] = []
# create search cursor which will iterate through orps owners
with arcpy.da.SearchCursor(orpsFL, ["OWNER_L", "OBJECTID"]) as orpscur:
for row in orpscur:
# set orps owner name
orpsowner = row[0]
ID = row[1]
# compare owner's names
if owner != orpsowner:
pass
else:
x = x + 1
matches[record].append(ID)
to the following. This method removes iterating through the orpsFL with each well.
# create search cursor which will iterate through orps owners
with arcpy.da.SearchCursor(orpsFL, ["OWNER_L", "OBJECTID"]) as orpscur:
for row in orpscur:
# set orps owner name
orpsowner = row[0]
ID = row[1]
# see if orps owner is in list of well owner
if orpsowner in well_names.values():
x += 1
matches[orpsowner].append(ID)
2) SQL has an IN operator that is very useful. You are adding to the selection for each matching OID. It's faster to say "OBJECTID IN (1,2,3,4,5)" than selecting 1, then 2, then 3, etc. So we can change:
for key in matches:
arcpy.SelectLayerByAttribute_management(wellsFL, "NEW_SELECTION", "OBJECTID = {}".format(key))
for element in matches[key]:
nextmatch = matches[key].pop()
arcpy.SelectLayerByAttribute_management(orpsFL, "ADD_TO_SELECTION", "OBJECTID = {}".format(nextmatch))
to:
for key in matches:
arcpy.SelectLayerByAttribute_management(wellsFL, "NEW_SELECTION", "OBJECTID = {}".format(key))
# tuple of ids to select
id_select = tuple(matches[key])
arcpy.SelectLayerByAttribute_management(orpsFL, "NEW_SELECTION", "OBJECTID in {}".format(id_select))
I hope I didn't change any variable names and that you find this helpful.
Debugging your query
assembly math with a fixed item
of 'Named Road'
using:
parts = query.split('Or')
print("Expression members:\n\t{:s}".format('OR\n\t'.join(parts)))
I got output:
"ST_NAME1 = 'Named Road' OR
Remark1 = 'Named Road' OR
ST_NAME2 = 'Named Road' OR
Remark2 = 'Named Road' OR
ST_NAME3 = 'Named Road' OR
Remark3 = 'Named Road' OR
ST_NAME4 = 'Named Road' OR
Remark4 = 'Named Road' OR
ST_NAME5 = 'Named Road' OR
Remark5 = 'Named Road' OR
ST_NAME6 = 'Named Road' OR
Remark6 = 'Named Road' OR
ST_NAME7 = 'Named Road' OR
Remark7 = 'Named Road' OR
ST_NAME8 = 'Named Road' OR
Remark8 = 'Named Road' OR
ST_NAME9 = 'Named Road' OR
Remark9 = 'Named Road'"
So it appears that you put double-quotes around the whole expression, making it one huge Boolean field name (hence the syntax error).
But there's lots of simplification opportunities here, exploiting the use of both str.format()
and str.join()
:
fields = []
for i in range(1,10):
fields.append("ST_NAME{:d}".format(i))
fields.append("Remark{:d}".format(i))
for item in street_names:
terms = []
for field in fields:
terms.append("{:s} = '{:s}'".format(field,item))
query = ' OR '.join(terms)
in_data = arcpy.management.SelectLayerByAttribute(
self.street, "ADD_TO_SELECTION", query , "NON_INVERT")
Not only is this cleaner, you also don't need to debug string math.
It will run slower (albeit not measurably slower, in the context of layer selection). I benchmarked the two different query
assembly algorithms, and the .join
added one millisecond to overall execution across all 104 street names).
So then I tried a list comprehension in terms assembly:
for item in street_names:
terms = ["{:s} = '{:s}'".format(field,item) for field in fields]
query = ' OR '.join(terms)
and was able to cut the cost to 0.5 milliseconds (measurement averaged across 500 iterations).
If you want double quotes around each field name (permitted, but not required for file geodatabase), then the list comprehension gets six extra characters:
for item in street_names:
terms = [""""{:s}" = '{:s}'""".format(field,item) for field in fields]
query = ' OR '.join(terms)
Best Answer
See this post: Using "IN" for SQL statement in Python
Your current code returns this for the expression: "Field = 'XXXX' "
You need this in your expression (assuming your using shapefiles): "Field"= 'XXXX'
For GDB feature class, you need this: Field = 'XXXX'
You also need a feature layer for the selection.
Try this instead of your code: