[GIS] Calculating Focal Statistics for Special Neighborhood

fionapythonraster-calculatorshapelyspatial statistics

I'm looking to calculate focal statistics for each cell of a raster, within a neighborhood of a specified criteria.

Background – I have three binary rasters, each representing a single vegetation type of interest. I'd like to calculate the percent coverage of each vegetation type within (e.g.) 20 km^2 of any cell in my study area (sum/total cells in neighborhood). The problem is that I can't use a simple circle or square neighborhood around each cell because, if I did, the search area used to calculate the sum would incorporate areas outside my study area. This exception is important because the statistics will be used as inputs for a habitat model, and the areas outside of my study area cannot be considered possible habitat – they're urbanized. Including them would give me erroneous statistics. So, what I'm looking to do is write some code in python that will choose a neighborhood representing the n nearest cells (n determined by number of cells required to cover an area equal to my desired neighborhood size) that meet my criteria. The criteria being that they do not fall within an urbanized area. I'm thinking that some form of cellular automata should be used. I've never worked with CA though.

I guess what I'd like is something like starter code, or a point in the right direction.


REPLY TO COMMENT BELOW:

Let's say I'm calculating this statistic for a cell on the boundary of my study site. If I assign all areas outside of my study area to zero (or ignore NoData), then I will get a statistic that represents roughly half of the areal coverage I'm interested in. So, percent coverage in a ~10 km^2 area, instead of 20 km^2 area. Since I'm studying home range sizes this is important. The neighborhood has to change shape, since that is how the animal views/uses the landscape. If they need 20 km^2, they'll change the shape or their home territory. If I do not check ignore NoData, cell output will be NoData – and NoData is no help.


"PROGRESS" AS OF 10/24/2014

Here is the code I've come up with so far using Shapely and Fiona:

import numpy as np
import pprint
import shapely
from shapely.geometry import*
import fiona
from fiona import collection
import math

traps = fiona.open('C:/Users/Curtis/Documents/ArcGIS/GIS_Data/occurrence/ss_occ.shp', 'r')

study_area = fiona.open('C:/Users/Curtis/Documents/ArcGIS/GIS_Data/Study_Area.shp', 'r')
for i in study_area: #for every record in 'study_area'
        sa = shape(i['geometry']) #make a variable called 'sa' that is a polygon

grassland = fiona.open('C:/Users/Curtis/Documents/ArcGIS/GIS_Data/land_cover/polys_for_aa/class3_aa.shp', 'r')
pol = grassland.next()
gl = MultiPolygon([shape(pol['geometry']) for pol in grassland])

areaKM2 = 20
with traps as input:
    r = (math.sqrt(areaKM2/math.pi))*1000
    for point in input:
        pt = shape(point['geometry'])
        pt_buff = pt.buffer(r)
        avail_area = pt_buff.intersection(sa).area
        # works to here
        while avail_area < areaKM2:
            r += 10
            pt_buff = pt.buffer(r)
            avail_area = pt_buff.intersection(sa).area

        perc_cov = pt_buff.intersection(gl).area//areaKM2
        print perc_cov

Unfortunately, it's INCREDIBLY slow.

Best Answer

The code above is the eventual, and imperfect, answer I came up for this problem. In the end I thought the best approach was to use a circular neighborhood and calculate the area intersecting my "available" area. (A circular neighborhood would give the n ~closest cells anyway - so, no need to get too fancy with Cellular Automata.) If the area was too small, I just grew the circle until it wasn't.

It worked well but, as I noted, it was very slow. See this thread for suggestions on how to speed it up. Maximizing Code Performance for Shapely. I followed the suggestions, which lead to this thread Understanding the Use of Spatial Indexes. I did not end up applying an r-tree in the end, because I actually never ended up using the code. I found out that I could approach the problem from an entirely different angle and save myself a lot of time/energy, so I did. Maybe I'll finish it one day...

Related Question