[GIS] Unique Values from Multiple Fields using Arcpy and Numpy

numpypython

I am using Python 2.7 and am trying to generate a list of unique values from all values within two fields (cropsum and cropsum2). I am not trying to find unique row combinations. I've tried using both methods from http://geospatialtraining.com/get-a-list-of-unique-attribute-values-using-arcpy/ but neither method is working for me using two fields. My preference is to use numpy as and I am trying to write a script to manipulate data from a Frequency tool dbf output.

Method 1: Search Cursor

in_table = 'frequencySoutheastSubset.dbf'
field_names = ['cropsum', 'cropsum2']
with arcpy.da.SearchCursor(in_table, [field_names]) as cursor:
    uniqueCropCodes = sorted({row[0] for row in cursor})                   
    print uniqueCropCodes

returns the following error:
TypeError: 'field_names' must be string or non empty sequence of strings

Method 2: Numpy

arr = arcpy.da.TableToNumPyArray('frequencyNorthSubset.dbf',['cropsum', 'cropsum2'])
print(arr)

uniqueCropCodes = numpy.unique(arr)
print uniqueCropCodes

Returns unique row combinations, which is not what I want.
The numpy.unique documentation example

>>> a = np.array([[1, 1], [2, 3]])
>>> np.unique(a)
array([1, 2, 3])

suggests that the code above should work.

I tried suggestions from this np.unique(arr[['cropsum', 'cropsum2']].values)
did not work either: AttributeError: 'numpy.ndarray' object has no attribute 'values'

My dbf in case it's helpful:enter image description here

Best Answer

Use a set:

A set is an unordered collection with no duplicate elements

import arcpy
in_table = 'frequencySoutheastSubset.dbf'
field_names = ['cropsum', 'cropsum2']

all_values = []

with arcpy.da.SearchCursor(in_table,field_names) as cursor:
    for row in cursor:
        all_values.extend(list(row))

unique_values = set(all_values)

print unique_values

This should also work:

from itertools import chain

unique_values = set(list(chain(*[i for i in arcpy.da.SearchCursor(in_table, field_names)]))