[GIS] Auto-incrementing attribute fields using ArcPy cursors and conditional statements

arcgis-10.1arcpycursorpython-2.7

I am trying to create unique IDs for an attribute named "FACILITYID" (str) and have got a ton of help from related posts and online python tutorials. I selected the rows I wished to create unique ID's for, and used this autoIncrement() function in the field calculator which works fine.

def autoIncrement(pStart=1,pInterval=1): 
    global rec
    if (rec == 0):  
        rec = pStart  
        return rec
    else:
        rec = rec + pInterval
        return rec

However, I would like to keep track of the variable rec so that I can make new unique IDs later. After concluding that I must move out of the field calculator, I have attempted to use cursors instead.

To update the maximum ID value, I created a data table in the environment with a single attribute "FACILITYID" that can be updated when the code runs. I have had success with this except for one small detail: I only want to update rows that do not already have a FacilityID. Specifically, I only want the code to work for rows in with the FACILITYID is equal to one space character.

FACILITYID == ' '

I have tried countless if statements, while loops, the getValue() function, and more, but can't seem to get it right. Can anyone help me make this code work for my target attribute fields?

#import modules
import os, sys
import arcpy

#Set variables
rec = 0
workspace = "Database Connections\FE.sde"
arcpy.env.workspace = workspace
feature_name = "FE.SManhole" #change feature class if required
field = "FACILITYID"
table_name = "FE.SWR_FacilityID"

tRows = arcpy.SearchCursor(table_name)
tRow = tRows.next()

#Pull in the current highest value from Facility_ID table
maxID = int(tRow.getValue(field))

# define automatic increment function
def autoIncrement(pStart=maxID,pInterval=1):
    global rec
    if (rec == 0):  
        rec = pStart  
        return rec
    else:
        rec = rec + pInterval
        return rec

#loop through table to update FacilityID if needed.
cursor = arcpy.UpdateCursor(feature_name) #There is no guarantee of order here
for row in cursor:
#####location where I have been trying if statements.####
    row.setValue(field, str(autoIncrement()))
    cursor.updateRow(row)

#save new rec value to FE.SWR_FacilityID table
maxID = rec
tRows = arcpy.UpdateCursor(table_name)
tRow = tRows.next()
tRow.setValue(field, str(maxID))
tRows.updateRow(tRow)

del tRows, tRow, cursor, row
print 'Highest "FACILITYID" Value: ' + str(maxID)

I am very new to python and I got a majority of this code from previous posts including Auto-incrementing in Python Script with ArcPy cursor?

Best Answer

The function you're using to auto increment IDs is meant for use in the field calculator; in the field calculator you have to define the global variable to be used by other rows in the table, otherwise all your values would be 1. For your purposes, you necessarily need to define rec as a global variable, and you don't need to create a function to add 1 (just use rec+=1).

Problem: You have a table of facilities, some facilities have IDs already and others do not. The ID is a numeric value that is stored in a text field, so you need to find the largest ID number that is already assigned, and keep track of the largest ID your script assigned. I'm assuming you don't want to auto-assign an ID that is already assigned to another row (making your field a primary key).

With an arcpy.da cursor you can have a query (where clause) built into the cursor, so that you're only iterating through rows that satisfy the conditions of the query (http://desktop.arcgis.com/en/arcmap/10.3/analyze/arcpy-data-access/updatecursor-class.htm) . Try this:

import os, sys
import arcpy

#Set variables
rec = 0
workspace = "Database Connections\FE.sde"
arcpy.env.workspace = workspace
feature_name = "FE.SManhole" #change feature class if required
field = ["FACILITYID"]
table_name = "FE.SWR_FacilityID"

### Create a list of all facility IDs that are already populated
### convert every item to integer, if the item is not ' ' or '' or None
###### I'm assuming your ids are integers, but you may have to change this
all_facility_ids = [int(x[0]) for x in arcpy.da.SearchCursor(feature_name, field) if x[0] not in [' ', '', None]]

### Sort the list of IDs (smallest to largest) and return the last item
max_id = sorted(all_facility_ids)[-1]


#loop through data with arcpy.da.UpdateCursor (way faster)
# arcpy.da cursors have the option to use a query, so we'll do that
query = "FACILITYID not in (' ', '', NULL)"
with arcpy.da.UpdateCursor(feature_name, field, query) as cursor:
    for row in cursor:
        while True: ## This part will keep adding to rec if the number is already in the list of facility IDs
            if rec in all_facility_ids:
                rec +=1
            else:
                rec +=1
                break
        row[0] = str(rec)
        cursor.updateRow(row)
del cursor

Here's another option, if you want to use an if statement to iterate through all the rows:

with arcpy.da.UpdateCursor(feature_name, field) as cursor:
    for row in cursor:
        if row[0] not in (' ', '', None):
            while True: ## This part will keep adding to rec if the number is already in the list of facility IDs
                if rec in all_facility_ids:
                    rec +=1
                else:
                    rec +=1
                    break
            row[0] = str(rec)
            cursor.updateRow(row)
        else: continue
del cursor
Related Question