[GIS] Find max and min values from a table based on two fields

arcpyattribute-joinscursor

I have a geodatabase table with the following fields: County_Name, State_Name, Date_Time, Customers

Example

Alameda, CA, 8/1/2014 12AM, 6  
Alameda, CA, 8/1/2014 1AM, 1  
Alameda, CA, 8/1/2014 2AM, 3  
....

Data for all US counties is in the table. I just want to create a new table that contains the

county, state, max customers for the day (24 hour period), time of the max customers
county, state, min customers for the day (24 hour period), time of the min customers

Many states have the same county names as other states.

Do I need to create a cursor to obtain a list of the unique county and state names first?

This will give me a list of the county names but I also need the state names.

with arcpy.da.SearchCursor(tablesMerged, ["county_nam"]) as cursor:
        print sorted({row[0] for rows in cursor})

Best Answer

Since you said you just need a hint with python I came up with this:

My test table:

import arcpy

arcpy.env.workspace = "..." #on windows use \\


#SEARCHCURSOR
#SearchCursor (in_table, field_names, {where_clause}, {spatial_reference}, {explode_to_points}, {sql_clause})"""
stable = "test2.shp"
sfield = ["county", "state", "date", "Customers"]


namesdb = set(row[0] for row in arcpy.da.SearchCursor(stable, sfield)) #a set is a collection type which does not allow duplicates
#since there are a lot of duplicates (countynames) we only want each countyname once
#more information on sets can be found here http://www.python-course.eu/sets_frozensets.php

print "Countynames without duplicates: ", namesdb



"""
I only left that in for better understanding what is basically happening in the generator above
The namesdb = set(row.....) replaces all of this:


names = []
namesdb = []

with arcpy.da.SearchCursor(stable, sfield) as scursor: #search for the county names
    for srow in scursor:
        names.append(srow[0]) #append all found countynames to a list


print "duplicates: ", names


for i in names: #get rid of dublicates in statenames
    if i not in namesdb:
        namesdb.append(i) #create new list without duplicates


print "without duplicates: ", namesdb

"""


countylist = {} #create a dict where elements from namesdb are the KEYS
for name in namesdb:
    countylist[name] = []

print "dict: ", countylist


with arcpy.da.SearchCursor(stable, sfield) as scursor:
    for srow in scursor:
        for county in namesdb: #iterate through the namesdb list
            if srow [0] == county: #get the value of each defined column in sfield for each row of countyname
                    countylist[county].append(srow) #append to the matching KEY in dictionary

        else:
            continue


print "dict with values: ", countylist


from operator import itemgetter
#now the trickiest part (for me, cause I am a python beginner, and that really hurt my head). I went to SO to ask a question. See here for better underatanding whats happening: http://stackoverflow.com/questions/25288927/finding-the-max-value-of-list-in-dictionary?noredirect=1#comment39411443_25288927

maxcostumers = {k: max(v, key=itemgetter(-1)) for k, v in countylist.iteritems()} #returns max number of customers, if customers is the last column in your table. If not, adjust the -1
mincostumers = {k: min(v, key=itemgetter(-1)) for k, v in countylist.iteritems()} #returns the min number of customers, as see above


print "row from table with max costumers: ", maxcostumers
print "row from table with min costumers: ", mincostumers

Thats the Output generated:

ountynames without duplicates:  set([u'New_York', u'Jersy', u'Alameda'])

dict:  {u'New_York': [], u'Jersy': [], u'Alameda': []}

dict with values:  {
u'New_York': [(u'New_York', u'NY', datetime.datetime(2014, 8, 13, 0, 0), 10), (u'New_York', u'NY', datetime.datetime(2014, 8, 13, 0, 0), 4), (u'New_York', u'NY', datetime.datetime(2014, 8, 13, 0, 0), 3)], 
u'Jersy': [(u'Jersy', u'JY', datetime.datetime(2014, 8, 13, 0, 0), 6), (u'Jersy', u'JY', datetime.datetime(2014, 8, 13, 0, 0), 7)], 
u'Alameda': [(u'Alameda', u'CA', datetime.datetime(2014, 8, 13, 0, 0), 1), (u'Alameda', u'CA', datetime.datetime(2014, 8, 13, 0, 0), 2), (u'Alameda', u'CA', datetime.datetime(2014, 8, 13, 0, 0), 3), (u'Alameda', u'CA', datetime.datetime(2014, 8, 13, 0, 0), 1)]
}

row from table with max costumers:  {
u'New_York': (u'New_York', u'NY', datetime.datetime(2014, 8, 13, 0, 0), 10),
u'Jersy': (u'Jersy', u'JY', datetime.datetime(2014, 8, 13, 0, 0), 7), 
u'Alameda': (u'Alameda', u'CA', datetime.datetime(2014, 8, 13, 0, 0), 3)
}

row from table with min costumers:  {
u'New_York': (u'New_York', u'NY', datetime.datetime(2014, 8, 13, 0, 0), 3), 
u'Jersy': (u'Jersy', u'JY', datetime.datetime(2014, 8, 13, 0, 0), 6), 
u'Alameda': (u'Alameda', u'CA', datetime.datetime(2014, 8, 13, 0, 0), 1)
}

Then you only need an UpdateCursor and the formatting for the time. Information regarding this you will find here: http://resources.arcgis.com/en/help/main/10.1/index.html#//018w00000014000000 and http://resources.arcgis.com/en/help/main/10.1/index.html#//001700000046000000

Related Solutions

[GIS] Arcpy.da.SearchCursor to retrieve the min and max measure values from Polyline-M shape

Something like this?

with arcpy.da.SearchCursor(fc, "SHAPE@") as rows:
    for row in rows:
        extent = row[0].extent
        # Just an example of accessing the measures from extent:
        arcpy.AddMessage("Feature measures from {0} to {1}.".format(extent.MMin, extent.MMax))

ArcPy Geoprocessing – How to Find Max Value Among Fields and Print Field Name

Since your answer was tagged with arcgis-10.3, you'll find there is no need to use arcgisscripting. (I don't even know where to start with it to be honest.) You can also use the updated data access module cursors.

There's no need to use zip or any kind of sorting since you are accessing values on a per-row basis. As you mentioned, you can use field calculator to find the max, which would simplify this somewhat since you can just grab the maximum value inside the for loop instead of calculating it as I'm doing:

import arcpy

# input data and fields to find maximum
table = '<path to dataset>'
fields = ["F1","F2","F3","F4","F5"]
n = len(fields)


# Add fields to input table to store maximum and field name
maxfield = "HIGHEST"
maxname = "HIGHEST_name"
arcpy.AddField_management(table, maxfield, "SHORT")
arcpy.AddField_management(table, maxname, "TEXT")

fields2 = fields[:] # shallow copy
fields2.extend([maxfield, maxname])

with arcpy.da.UpdateCursor(table, fields2) as cursor:
    for row in cursor:

        # Look at the first n values        
        check = row[:n]
        maxval = max(check)

        # Get the index position of the maxval and use it slice into field list
        # which gives us the field name
        # only update the last two rows
        row[-2:] = maxval, fields[check.index(maxval)]

        cursor.updateRow(row)

As you can see, some of the maximum values are present in multiple fields. The .index() method only grabs the first occurrence, so you'll need to change a few things to get all the field names, if desired.

Best Answer

Related Solutions

[GIS] Arcpy.da.SearchCursor to retrieve the min and max measure values from Polyline-M shape

ArcPy Geoprocessing – How to Find Max Value Among Fields and Print Field Name

Related Question