[GIS] Deleting duplicated features with ArcPy

I am trying to use this Python code to delete duplicated features that have the same centroid XY coordinates (I created those fields with Calculate Geometry in the attribute table):

import arcpy
'''
first we create x center coordinate field (the same for y) in the attribute
table manually, then we will run this code.
'''
list1 = []
listToDeleteX = []
listToDeleteY = []
fc = r"G:\desktop\Project\lyr\polygon.shp"
# check the x coordinate
with arcpy.da.UpdateCursor(fc, "xCenter") as cursor:
    for row in cursor:
        list1.append(row)
        if list1.count(row)>1:
            listToDeleteX.append(row)

# check the y coordinate
with arcpy.da.UpdateCursor(fc, "yCenter") as cursor:
    for row in cursor:
        list1.append(row)
        if list1.count(row)>1:
            listToDeleteY.append(row)

listToDeleteY.append(listToDeleteX)

At the end of the code I added the x list to the y list, but I don't know how to delete the duplicated rows.

I work with ArcGIS for Desktop and don't have any extensions, so I can't use the "arcpy.DeleteIdentical_management" tool.

This is the attribute table of the polygon layer:

Best Answer

This code works on a table and searches a numeric field called test to find and delete duplicates. It assumes the first instance of a duplicate value is the one you want to keep.

import arcpy

def main():
    first_seen = {}  # dictionary, key is test value, item is objectID
    tbl = r"C:\Scratch\fGDB_AIS_Cleaned.gdb\test"  # Table to test

    # Search table, adding only the first occurrence of a value and its objectID
    print("reading dataset...")
    with arcpy.da.SearchCursor(tbl, ["OBJECTID", "test"]) as cursor:
        for row in cursor:
            objID = row[0]
            val = row[1]
            if val not in first_seen:
                first_seen[val] = objID

    # Get a list of objectIDs to keep
    oList = list(first_seen.values())

    # Check duplicates if they exist
    n = int(arcpy.GetCount_management(tbl).getOutput(0))
    if n > len(oList):
        print("deleting duplicates...")

        # Create a SQL expression on ObjectID
        sql = "OBJECTID NOT IN (" + ", ".join(str(oid) for oid in oList) + ")"

        # Delete duplicates
        arcpy.MakeTableView_management(tbl, "tocleanup")
        arcpy.SelectLayerByAttribute_management("tocleanup", "NEW_SELECTION", sql)
        arcpy.DeleteRows_management("tocleanup")
        print("deleted duplicate rows!")
        arcpy.Delete_management("tocleanup")

if __name__ == '__main__':
    main()
    print("finished!")
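For the polygon layer in the question, the same idea can be applied to the xCenter and yCenter fields together. Treating the two coordinates as one pair matters, because checking x and y separately would also flag features that share only one of them. The following is a minimal sketch, not a tested tool: the function names and the rounding tolerance `ndigits` are assumptions made for illustration, and it assumes both fields are numeric and never empty. It keeps the first feature found for each pair and deletes the later ones with `UpdateCursor.deleteRow()`.

```python
def duplicate_oids(rows, ndigits=3):
    """Return the OIDs of rows whose rounded (x, y) pair was already seen.

    rows is any iterable of (oid, x, y) tuples; ndigits is an assumed
    tolerance so that tiny floating-point differences do not hide duplicates.
    """
    seen = set()
    duplicates = []
    for oid, x, y in rows:
        key = (round(x, ndigits), round(y, ndigits))
        if key in seen:
            duplicates.append(oid)  # a later occurrence of a known pair
        else:
            seen.add(key)           # first occurrence, keep it
    return duplicates


def delete_duplicate_centroids(fc, x_field="xCenter", y_field="yCenter"):
    """Delete features whose centroid pair repeats an earlier feature."""
    import arcpy  # imported here so duplicate_oids can be tried without ArcGIS

    # First pass: read the coordinates and collect the OIDs to delete
    with arcpy.da.SearchCursor(fc, ["OID@", x_field, y_field]) as cursor:
        to_delete = set(duplicate_oids(cursor))

    # Second pass: delete the collected rows
    with arcpy.da.UpdateCursor(fc, ["OID@"]) as cursor:
        for (oid,) in cursor:
            if oid in to_delete:
                cursor.deleteRow()
    return len(to_delete)
```

Calling `delete_duplicate_centroids(r"G:\desktop\Project\lyr\polygon.shp")` would edit the shapefile in place, so it is worth trying it on a copy first.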

One thing to consider: I have found that deleting from very large datasets (e.g. millions of rows) can be very slow. It is much quicker to copy the rows you want to keep into a new dataset.
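The copy-out approach could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names, the output arguments `out_path` and `out_name`, and the default key field `test` are hypothetical, the source is assumed to be a plain table (no geometry), and the output is created empty from the source schema with `arcpy.CreateTable_management`. Only the first row for each key value is written to the new table.

```python
import os


def first_occurrences(rows, key_index):
    """Yield only the first row seen for each value at position key_index."""
    seen = set()
    for row in rows:
        key = row[key_index]
        if key not in seen:
            seen.add(key)
            yield row


def copy_unique_rows(src, out_path, out_name, key_field="test"):
    """Copy the first row for each key_field value from src into a new table."""
    import arcpy  # imported here so first_occurrences can be tried without ArcGIS

    # Create an empty output table that uses the source table as a template
    arcpy.CreateTable_management(out_path, out_name, src)
    out_table = os.path.join(out_path, out_name)

    # Copy every field except the ObjectID, which the output manages itself
    fields = [f.name for f in arcpy.ListFields(src) if f.type != "OID"]
    key_index = fields.index(key_field)

    insert_cursor = arcpy.da.InsertCursor(out_table, fields)
    try:
        with arcpy.da.SearchCursor(src, fields) as search_cursor:
            for row in first_occurrences(search_cursor, key_index):
                insert_cursor.insertRow(row)
    finally:
        del insert_cursor  # release the lock on the output table
    return out_table
```

Because nothing is deleted, the original table stays untouched and can be compared with the result before it is replaced.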
