I have a few hundred GPS data points (saved in an ArcGIS Pro 2.9 file geodatabase) that logged simultaneously with different speeds. I am looking to write a Python script that will compare each row's DateTime field, and if they are identical, update the first occurrence with the lowest speed (int field) logged.
INPUT:
|OID| DateTime |Speed|
| 1 | 12/11/2021 16:41:21 | 14 |
| 2 | 12/11/2021 16:41:21 | 12 |
| 3 | 12/11/2021 16:41:32 | 20 |
| 4 | 12/11/2021 16:41:32 | 25 |
TARGET:
|OID| Time |Speed|
| 1 | 12/11/2021 16:41:21 | 12 |
| 2 | 12/11/2021 16:41:32 | 20 |
Below is the code I used to delete exact duplicates:
check_fields = ['OBJECTID', 'SpeedDate' # concatenated speed and date fields]
found_values = defaultdict(set)
with arcpy.da.UpdateCursor(input_dataset, check_fields) as cursor:
for row in cursor:
for i, column in enumerate(check_fields):
if row[i] in found_values[column]:
cursor.deleteRow()
break
else:
for column, value in zip(check_fields, row):
found_values[column].add(value)
I'm pretty new to using UpdateCursor and ArcPy in general. I assume the first part of this code will still be useful in creating a set of unique values.
How do I go about assigning the first occurrence a value based on a second row comparison?
Best Answer
Create a list of tuples using da.SearchCursor. Sort the list by date and speed, so the lowest speed for each date comes last. Create a dictionary of datetime:lowest speed. Create a dictionary of OID:lowest speed, update.