I have a feature class where there are multiple values duplicated in a row, i.e. SRNumber and RESOLUTION_CODE. I would like to delete all rows where my SRNumber and RESOLUTION_CODE are found in multiple rows. I have got a good start with this script, everything works aside from actually deleting the rows. How can I fix this so that my duplicate rows are deleted?
So if SRNumber = 1-23456
and Resolution_Code = 'A'
and appears twice, I would like to delete one.
import arcpy
in_table = r"ServiceRequest.DBO.SO_Dead_Animal"
fields = ["NumberCYLA","RESOLUTION_CODE"] #replace with own
curStrings = arcpy.da.SearchCursor(in_table,fields,"NumberCYLA is not null")
stringsKeysList = []
numbersValuesList = []
tableDict = {}
for row in curStrings:
print row[0],row[1]
stringsKeysList.append(row[0])
del curStrings
uniqueStringsList = list(set(stringsKeysList)) #only unique values from Strings field
for uniqueString in uniqueStringsList:
emptyTempValues = []
curValues = arcpy.da.SearchCursor(in_table,fields,"""NumberCYLA = '{0}'""".format(str(uniqueString)))
for row in curValues:
print row[1]
emptyTempValues.append(row[1])
uniqueEmptyTempValues = list(set(emptyTempValues))
tableDict[uniqueString] = uniqueEmptyTempValues
print tableDict #a dictionary with string:values
#{u'String_2': [-4, -2],
#u'String_3': [None, -3],
#u'String_56': [-1],
#u'String_1': [-1],
#u'String78': [-1]}
rowsToDelete = [k for k, v in tableDict.iteritems() if v == [-1]] #find which rows to delete
print "list: ", rowsToDelete
stringRows = ","
updRows = stringRows.join(rowsToDelete)
updRows = updRows.replace(",","','")
print updRows #update the string to be used with proper syntax in where clause
where_clause = """NumberCYLA in ('{0}')""".format(updRows)
upd_cur = arcpy.da.UpdateCursor(in_table,fields,where_clause)
for row in upd_cur:
print row[0],row[1]
del rowsToDelete
Best Answer
If you're wanting to keep one row of each duplicate, make use of Delete Identical. Use your two fields as your field inputs.
arcpy.DeleteIdentical_management (r"c:\example\example.gdb\example", ["SRNumber", "Resolution_Code"])
If you would like to delete all duplicates, determine the duplicates with a little list comprehension, and then use an update cursor to delete your rows.