ArcGIS – Count Feature Class Records with Column Combinations Matching Another Table

arcgis-desktoparcgis-promodelbuildersql

I have difficulty formulating a good title for this problem.

Problem statement

I have the following 2 datasets, that are included into the same file local GDB. For these tables I want to find the species in a squared area that are key species for a area_type per square. E.g. in square A the area type A and B are found. In Area A the following species are key species (x,y,z), species (x,y,z, q) are found thus for square A we found 3/3 key species using 4 distinct observations (x,y,z,q)

Limitations

I want to use it as a processing step in a Model in ArcGIS Pro. ArcPy, native Python and SQL are all allowed. I am using a FGDB.

Data

  1. A Feature Class representing squared areas where observation are made on species x,y,z. These squared areas belong to one or more area types. As such a following simplified structure is found
ID_square Type_area Species
A 1 x
A 1 y
A 1 z
A 2 x
A 2 q
B 2 y
B 1 x
B 1 y
B 1 z
B 2 x
B 2 y
B 2 y
  1. A Table view in my GDB that specifies the key species per type of area, such as:
Type_area Key_Species
A x
A y
A z
B y

Possible Solution

Use a Make Query Table with the following SQL statement:

SELECT * FROM Measurements A
LEFT JOIN Key B
ON B.Key_Species = A.Species
CASE
    WHEN B.Key_Species IS NULL THEN 0
        ELSE 1
END AS KEY_COUNT;

Which according to ArcGIS Pro is a valid statement. Using a left join I wanted to find the species that are also present as key species in the other table. In the end, summarizing on KEY_COUNT would yield my result. However, when running results in the error 'invalid SQL statement'. This might come from ArcGIS Pro using a different dialect. The documentation did not solve this issue.

desired resulting table (does not have to be spatial)

ID_square Type_area Species_Count Key_Species_Count
A 1 3 3
A 2 2 1

etc.

Best Answer

I agree with others' comments that file geodatabase SQL support isn't very robust, and it likely isn't possible to do what you want solely through SQL. Since you are open to ArcPy-based solutions, here is an Python dictionary-lookup approach that should work.

>>> import arcpy
>>> import collections
>>> 
>>> # Load measurements table into dictionary with tuple of
>>> # "ID_Square" and "Type_area" as the key
>>> measurements_dict = collections.defaultdict(list)
>>> measurements_table = "" # path to table containing measurements
>>> measurements_fields = ("ID_square", "Type_area", "Species")
>>> with arcpy.da.SearchCursor(measurements_table, measurements_fields) as cur:
...     for row in cur:
...         measurements_dict[row[0:-1]] += row[-1]
...
>>> # Load key species table into dictionary with "Type_area"
>>> # as the key
>>> key_dict = collections.defaultdict(list)
>>> key_table = ""  # path to table containing key species
>>> key_fields = ("Type_area", "Key_Species")
>>> with arcpy.da.SearchCursor(key_table, key_fields) as cur:
...     for row in cur:
...         key_dict[row[0]] += row[-1]
...
>>> # Loop through the measurements dictionary by "ID_Square" and "Type_Area"
>>> # pairs/tuples and compare the species present with key species dictionary
>>> for k,v in measurements_dict.items():
...     species_count, key_species_count = [],[]
...     for i in v:
...         species_count += i
...         if i in key_dict[k[0]]:
...             key_species_count += i
...     print(f"{k}, {len(set(species_count))}, {len(set(key_species_count))}")
...
...
('A', 1), 3, 3
('A', 2), 2, 1
('B', 2), 2, 1
('B', 1), 3, 1
>>>

I am a little unclear how to handle situations where the same key species occurs multiple times in a given ID_Square and Type_Area. The code above counts each occurrence and increments the key species counter. If you only want a given species counted once no matter how many occurrences, then a tweak will have to made to code.

UPDATE: Code block modified to count presence of species and not occurrences of species.

Related Question