[GIS] Selecting certain fields in layer using ArcPy

arcpyselect-by-attribute

I have a layer in geodatabase, and I want to use SelectLayerByAttribute to select certain fields in this layer.

Then, use CopyFeature to output those fields to a new shapefile.

for example:

layer = I created in gdb

selection_type = 'NEW_SELECTION'

SelectedFields = """ Field_name = 'XXXX' """

arcpy.SelectLayerByAttribute_management(layer, selection_type, SelectedFields)

But, my where_clause part always invalid.

anyone can help me get the correct SQL expression?

Best Answer

See this post: Using "IN" for SQL statement in Python

Your current code returns this for the expression: "Field = 'XXXX' "

You need this in your expression (assuming your using shapefiles): "Field"= 'XXXX'

For GDB feature class, you need this: Field = 'XXXX'

You also need a feature layer for the selection.

Try this instead of your code:

layer = I created in gdb

arcpy.MakeFeatureLayer_management(layer,"layer")

selection_type = 'NEW_SELECTION'

SelectedField = '\"Field\"' #for shapefile.  If gdb change to "Field"

Expression = SelectedField + '=' + "\'XXXX\'"

arcpy.SelectLayerByAttribute_management("layer", selection_type, Expression)

Related Solutions

[GIS] Select layer by attribute for entries in a dictionary (python)

This is more of a few tips than an answer. You mention that it takes a while to run. I have to things to point out that can save some you some time.

1) Your code goes through the wells once; no need to do it twice. You can get the list of well owner's names from the well_owners.values(). That's a python list of all the values in the dictionary. In your case you would get a list of string values. Thus we can chop:

# Create search cursor which will iterate through wells       
with arcpy.da.SearchCursor(wellsFL, ["OBJECTID"]) as cursor:
        for row in cursor:

            # set well owner name for this record
            record = row[0]
            owner = well_owners[record]
            matches[record] = []


            # create search cursor which will iterate through orps owners
            with arcpy.da.SearchCursor(orpsFL, ["OWNER_L", "OBJECTID"]) as orpscur:
                for row in orpscur:
                    # set orps owner name
                    orpsowner = row[0]
                    ID = row[1] 

                    # compare owner's names    
                    if owner != orpsowner:
                        pass


                    else:
                        x = x + 1
                        matches[record].append(ID)

to the following. This method removes iterating through the orpsFL with each well.

# create search cursor which will iterate through orps owners
with arcpy.da.SearchCursor(orpsFL, ["OWNER_L", "OBJECTID"]) as orpscur:
    for row in orpscur:
        # set orps owner name
        orpsowner = row[0]
        ID = row[1] 
        # see if orps owner is in list of well owner
        if orpsowner in well_names.values():
            x += 1
            matches[orpsowner].append(ID)

2) SQL has an IN operator that is very useful. You are adding to the selection for each matching OID. It's faster to say "OBJECTID IN (1,2,3,4,5)" than selecting 1, then 2, then 3, etc. So we can change:

for key in matches:
    arcpy.SelectLayerByAttribute_management(wellsFL, "NEW_SELECTION", "OBJECTID =    {}".format(key))
    for element in matches[key]:
        nextmatch = matches[key].pop()
        arcpy.SelectLayerByAttribute_management(orpsFL, "ADD_TO_SELECTION", "OBJECTID = {}".format(nextmatch))

to:

for key in matches:
    arcpy.SelectLayerByAttribute_management(wellsFL, "NEW_SELECTION", "OBJECTID =    {}".format(key))
    # tuple of ids to select
    id_select = tuple(matches[key])
    arcpy.SelectLayerByAttribute_management(orpsFL, "NEW_SELECTION", "OBJECTID in {}".format(id_select))

I hope I didn't change any variable names and that you find this helpful.

ArcGIS Pro – Select Layer by Attribute with Multiple Field Attribute Values Using ArcPy

Debugging your query assembly math with a fixed item of 'Named Road' using:

parts = query.split('Or')
print("Expression members:\n\t{:s}".format('OR\n\t'.join(parts)))

I got output:

    "ST_NAME1 = 'Named Road' OR
     Remark1 = 'Named Road' OR
     ST_NAME2 = 'Named Road' OR
     Remark2 = 'Named Road' OR
     ST_NAME3 = 'Named Road' OR
     Remark3 = 'Named Road' OR
     ST_NAME4 = 'Named Road' OR
     Remark4 = 'Named Road' OR
     ST_NAME5 = 'Named Road' OR
     Remark5 = 'Named Road' OR
     ST_NAME6 = 'Named Road' OR
     Remark6 = 'Named Road' OR
     ST_NAME7 = 'Named Road' OR
     Remark7 = 'Named Road' OR
     ST_NAME8 = 'Named Road' OR
     Remark8 = 'Named Road' OR
     ST_NAME9 = 'Named Road' OR
     Remark9 = 'Named Road'"

So it appears that you put double-quotes around the whole expression, making it one huge Boolean field name (hence the syntax error).

But there's lots of simplification opportunities here, exploiting the use of both str.format() and str.join():


    fields = []
    for i in range(1,10):
        fields.append("ST_NAME{:d}".format(i))
        fields.append("Remark{:d}".format(i))
    
    for item in street_names:
        terms = []
        for field in fields:
            terms.append("{:s} = '{:s}'".format(field,item))
        query = ' OR '.join(terms)
            
        in_data = arcpy.management.SelectLayerByAttribute(
                self.street, "ADD_TO_SELECTION", query , "NON_INVERT")

Not only is this cleaner, you also don't need to debug string math.

It will run slower (albeit not measurably slower, in the context of layer selection). I benchmarked the two different query assembly algorithms, and the .join added one millisecond to overall execution across all 104 street names). So then I tried a list comprehension in terms assembly:

    for item in street_names:
        terms = ["{:s} = '{:s}'".format(field,item) for field in fields]
        query = ' OR '.join(terms)

and was able to cut the cost to 0.5 milliseconds (measurement averaged across 500 iterations).

If you want double quotes around each field name (permitted, but not required for file geodatabase), then the list comprehension gets six extra characters:

    for item in street_names:
        terms = [""""{:s}" = '{:s}'""".format(field,item) for field in fields]
        query = ' OR '.join(terms)

Best Answer

Related Solutions

[GIS] Select layer by attribute for entries in a dictionary (python)

ArcGIS Pro – Select Layer by Attribute with Multiple Field Attribute Values Using ArcPy

Related Question