ArcPy Cursor – Updating First Occurrence of Duplicate with Lowest Value in ArcGIS Pro

arcgis-proarcpyattribute-tablecursorgps

I have a few hundred GPS data points (saved in an ArcGIS Pro 2.9 file geodatabase) that logged simultaneously with different speeds. I am looking to write a Python script that will compare each row's DateTime field, and if they are identical, update the first occurrence with the lowest speed (int field) logged.

INPUT:

|OID|        DateTime     |Speed|
| 1 | 12/11/2021 16:41:21 | 14 |
| 2 | 12/11/2021 16:41:21 | 12 |
| 3 | 12/11/2021 16:41:32 | 20 |
| 4 | 12/11/2021 16:41:32 | 25 |

TARGET:

|OID|        Time         |Speed|
| 1 | 12/11/2021 16:41:21 | 12 |
| 2 | 12/11/2021 16:41:32 | 20 |

Below is the code I used to delete exact duplicates:

check_fields = ['OBJECTID', 'SpeedDate' # concatenated speed and date fields]
found_values = defaultdict(set)


with arcpy.da.UpdateCursor(input_dataset, check_fields) as cursor:

    for row in cursor:
        for i, column in enumerate(check_fields):
            if row[i] in found_values[column]:
                cursor.deleteRow()
                break
        else:
            for column, value in zip(check_fields, row):
                found_values[column].add(value)

I'm pretty new to using UpdateCursor and ArcPy in general. I assume the first part of this code will still be useful in creating a set of unique values.

How do I go about assigning the first occurrence a value based on a second row comparison?

Best Answer

Create a list of tuples using da.SearchCursor. Sort the list by date and speed, so the lowest speed for each date comes last. Create a dictionary of datetime:lowest speed. Create a dictionary of OID:lowest speed, update.

import arcpy

fc = r'C:\GIS\ArcMap_default_folder\Default.gdb\datetime_speed'
fieldlist = ['OID@','DateTime','Speed']

all_rows = [row for row in arcpy.da.SearchCursor(in_table=fc, field_names=fieldlist)]
#[(1, datetime.datetime(2010, 4, 24, 0, 0), 0.22), (2, datetime.datetime(2010, 4, 24, 0, 0), 0.9), (13, datetime.datetime(2010, 4, 19, 0, 0), 0.21), (14, datetime.datetime(2010, 4, 19, 0, 0), 0.06), (16, datetime.datetime(2010, 4, 19, 0, 0), 1.0)]

all_rows.sort(key=lambda x: (x[1], x[2]), reverse=True)
#The lowest speed is last for each date:
#[(2, datetime.datetime(2010, 4, 24, 0, 0), 0.9), (1, datetime.datetime(2010, 4, 24, 0, 0), 0.22), (16, datetime.datetime(2010, 4, 19, 0, 0), 1.0), (13, datetime.datetime(2010, 4, 19, 0, 0), 0.21), (14, datetime.datetime(2010, 4, 19, 0, 0), 0.06)]

tempdict = {row[1]:row[2] for row in all_rows}
#Store each date as key and lowest speed in dictionary:
#{datetime.datetime(2010, 4, 24, 0, 0): 0.22, datetime.datetime(2010, 4, 19, 0, 0): 0.06}

#Create a dictionary of OID:lowest speed
updatedict = {row[0]:tempdict[row[1]] for row in all_rows}
#{2: 0.22, 1: 0.22, 16: 0.06, 13: 0.06, 14: 0.06}

#Update the feature class. I store the lowest speeds in a new field
with arcpy.da.UpdateCursor(fc, ['OID@','lowest_speed']) as cursor:
    for row in cursor:
        if row[0] in updatedict:
            row[1] = updatedict[row[0]]
            cursor.updateRow(row)

enter image description here

Related Question