[GIS] How to transform values using a dictionary or lookup table

arcgis-10.1arcpypython

I'm reading rows (~20 fields per row) from a database using a SearchCursor and placing them into an array. Then, I loop over the array and use an InsertCursor to insert them into a Feature Class in a different database.

The problem, I need to transform field values in the source data. For example, a column may contain the strings "T", "true", "Yes", and "1" and they must be converted to a string value of "TRUE" before being written to the destination column.

There may be multiple values in a source column that need to be mapped to a single value in the destination. This is the example above.
There may be multiple lookups per column. For example, one column may have as source value of "A" that gets transformed to "Z1" and in the same column, "B" gets transformed to "Z2", and still in the same column, "C" gets transformed to "Z1" (multiple source values mapped to same destination value).
There are many columns that will need lookups created. I'd prefer to have a different dictionary / lookup table for each column as there will likely be subtle differences between columns and trying to reuse dictionaries will get frustrating.

How should I approach this problem?

Update #1

Sample using suggestions by @mr.adam: Throws an error on the line if row[key].lower() in lookup(key[1]): with the message TypeError: int object is not subscriptable

lookup = {
    3: ("TRUE", ["t", "true", "1", "yes"]),
    4: ("FALSE", ["f", "false", "0", "no"])
     }

rows = []
rows.append(("abc", "123", "xyz", "True", "F"))
rows.append(("lmo", "456", "xyz", "1", "no"))
rows.append(("tuv", "456", "xyz", "yes", "0"))

for row in rows:
    for key, value in lookup.iteritems():
        if row[key].lower() in lookup(key[1]):
            row[key] = lookup(key[0])

for row in rows:
    print row

Update #2

after some additional digging, breaking down the above referenced line, row[key].lower() evaluates to "true" as expected for column 4 of the first row in the dataset. The error is thrown when evaluating the in clause of that line, lookup(key[1]).

Best Answer

I would make a dictionary that looks something like this:

lookups = {
    "TRUE":["t", "true", "1", "yes"],
    "FALSE":["f", "false", "0", "no"]
     }

Then, the logic could look like:

for row in rows:
    for i in range(0,len(row)):
        for key, value in lookup.iteritems():
            if str(row[i]).lower() in value:
                row[i] = key

That code will update the entire table at once, row by row. This would be a problem if you have field1 where the value "T" should be translated to "TRUE" and field2 where "T" should be translated to "Top". To if that is the case, you could modify the dictionary to:

lookups = {
    fieldindexnumber: ("TRUE",["t", "true", "1", "yes"]),
    fieldindexnumber: ("FALSE",["f", "false", "0", "no"]),
    fieldindexnumber: ("Unknown",["u"]), # make sure the second object in the tuple is a list []
     }

Then just change the looping structure to:

for row in rows:
    for key, value in lookup.iteritems():
        if row[key].lower() in value[1]:
            row[key] = value[0]

Note that I made all of the potential values lowercase and then cast the existing value to lowercase. I've found that to be very helpful a lot of times, but it may not be what you're looking for.

You may want to implement it in a different way, but I would definitely recommend going with some version of this dictionary, because you can just store it at the top of your script and it will be clearly laid out in case you want to change/add key-value pairs.

Related Solutions

ArcGIS – Using If Then Else in ArcGIS Model Builder

If the precondition that checks for True or False is expecting a boolean, it will always evaulate to true because non-empty strings return true values, like so:

>>> if "true":      print "Really True!"

Really True!
>>> if "false":     print "Really True!"

Really True!

Have you tried returning True or False (booleans) instead of "true" or "false" (strings)? That might solve the issue.

def CountMe(n):         
     if n > 0:
      return True
     else:
      return False

The following is also equivalent to the above:

def CountMe(n):         
    return True if n > 0 else False

A few notes:

Importing arcpy (which seems unnecessary) can drastically slow down your code. If you have 1000 rows, you might very well be increasing run time by 15+ minutes.

enter image description here

[GIS] Find index location of a lat/lon point on a raster grid in ArcPy

It's quite simple unless the raster is rotated. You just need to know the x,y coordinates of the point, the pixel size (width and height) and the coordinates of the upper left (xmin,ymax) of the raster.

import arcview,arcpy

def map_to_pixel(point_x, point_y, cellx, celly, xmin, ymax):
    col = int((point_x - xmin) / cellx)
    row = int((point_y - ymax) / -celly)
    return col,row

point_x=your_x_coordinate
point_y=your_y_coordinate

dsc=arcpy.Describe(your_raster)

cellx=dsc.MeanCellHeight
celly=dsc.MeanCellWidth

xmin=dsc.extent.XMin
ymax=dsc.extent.YMax

col,row=map_to_pixel(point_x, point_y, cellx, celly, xmin, ymax)

Pixels are numbered from the top left.

Best Answer

Related Solutions

ArcGIS – Using If Then Else in ArcGIS Model Builder

[GIS] Find index location of a lat/lon point on a raster grid in ArcPy

Related Question