[GIS] How to enclose clustered points with polygon

arcgis-10.0 convert polygon raster

I have a raster with forest (1) and no-forest (2) cells. The "1" cells represent single trees or small groups of trees, and there are gaps between them. I want to create polygons around these trees to produce a forest map. The forest is not dense, so it is not possible to filter the data to fill in the gaps between trees.
Are there any tools in ArcMap 10.0 or its extensions that could do this? I saw a tool doing this a while ago, but I don't remember whether it was an ArcMap tool or extension or part of another GIS package.

Regards

Best Answer

Your solution ought to depend on your understanding of the data and what you want to present.

Kernel Density

The question's use of "dense" suggests the underlying concept defining a "forested" area is tree density. Why not, then, compute a kernel density map of the trees and define forested areas as all cells exceeding a fixed threshold density? This allows (essentially) two user-selectable parameters: the kernel radius and the threshold.

One might object that Spatial Analyst does not support kernel density calculations for raster data and that the conversion from raster to point data followed by a kernel density might be very slow and cumbersome. But there's a simpler direct solution:

  1. Convert the grid into a binary indicator where 1 = trees, 0 = no trees. (Subtracting the grid from 2 will do this.)

  2. Compute a focal mean or weighted focal mean. It's probably best to use a circular neighborhood.

The radius of the neighborhood is the kernel radius. Using a focal mean is equivalent to a "simple kernel." To reproduce the effects of other kernel functions, use a weighted neighborhood. (This is more complicated to set up--a weight file has to be created first--but the execution should be just as fast.)
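The two steps above, plus the thresholding, can be sketched in plain Python on a small grid stored as a list of lists. This is only an illustration of the logic, not ArcGIS code: the function names, the choice to ignore cells that fall off the grid edge, and the radius and threshold values are all assumptions made for the example. In Spatial Analyst the corresponding step would normally be a focal statistics operation with a circular neighborhood.

```python
def to_indicator(grid):
    """Recode 1 = forest, 2 = no forest into 1 = trees, 0 = no trees (2 - value)."""
    return [[2 - v for v in row] for row in grid]


def circular_offsets(radius):
    """Cell offsets (dr, dc) whose centers lie within `radius` cells of the origin."""
    r = int(radius)
    return [(dr, dc)
            for dr in range(-r, r + 1)
            for dc in range(-r, r + 1)
            if dr * dr + dc * dc <= radius * radius]


def focal_mean(indicator, radius):
    """Mean of the indicator over a circular neighborhood.

    Assumption: neighborhood cells that fall outside the grid are ignored.
    """
    rows, cols = len(indicator), len(indicator[0])
    offsets = circular_offsets(radius)
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = count = 0
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    total += indicator[rr][cc]
                    count += 1
            out[r][c] = total / count
    return out


def forest_mask(grid, radius, threshold):
    """1 where the focal mean (a simple-kernel density) exceeds the threshold."""
    density = focal_mean(to_indicator(grid), radius)
    return [[1 if d > threshold else 0 for d in row] for row in density]


# Hypothetical 5 x 5 grid with a single tree in the middle.
grid = [[2] * 5 for _ in range(5)]
grid[2][2] = 1
mask = forest_mask(grid, radius=1, threshold=0.1)
```

The radius and the threshold are the two user-selectable parameters mentioned earlier: a larger radius smooths over wider gaps, and a higher threshold demands more trees per neighborhood before a cell counts as forest.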

Morphological Operators

A less conceptually grounded solution, but requiring only one quick raster operation, is to expand the tree cells until they appear to join into a small number of continuous forested areas. (This is the raster equivalent of buffering the tree points.) If the expansion is fairly large (causing obviously non-forested regions to be covered), the resulting patches can be contracted again by the same amount.
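As a rough sketch of the expand-then-contract idea, the following plain Python works on a binary indicator grid (1 = trees, 0 = no trees). The function names are made up for the example, the neighborhood is assumed to be the 3 x 3 block around each cell, and cells beyond the grid edge are assumed not to erode the result. It illustrates the operation only; it is not the ArcGIS implementation.

```python
def expand(mask, n=1):
    """Grow the 1-cells by n cells using the 3 x 3 (8-neighbor) block."""
    rows, cols = len(mask), len(mask[0])
    for _ in range(n):
        mask = [[1 if any(mask[rr][cc]
                          for rr in range(max(r - 1, 0), min(r + 2, rows))
                          for cc in range(max(c - 1, 0), min(c + 2, cols)))
                 else 0
                 for c in range(cols)]
                for r in range(rows)]
    return mask


def shrink(mask, n=1):
    """Contract the 1-cells by n cells, i.e. expand the 0-cells by n cells.

    Assumption: the area outside the grid does not eat into the patches.
    """
    inverted = [[1 - v for v in row] for row in mask]
    grown = expand(inverted, n)
    return [[1 - v for v in row] for row in grown]


def close_gaps(mask, n=1):
    """Expand by n cells, then contract by the same amount."""
    return shrink(expand(mask, n), n)


# Hypothetical example: two trees separated by a one-cell gap.
trees = [[0] * 7 for _ in range(5)]
trees[2][2] = 1
trees[2][4] = 1
patches = close_gaps(trees, n=1)
```

In this example the two trees end up joined through the gap cell, while an isolated tree would return to its original single cell after the contraction.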

Distance Buffering

The morphological operations, if they must extend over many cells, might be less efficient than just computing the Euclidean Distance grid for the forested cells and then selecting all locations within a threshold distance. (This is the raster equivalent of buffering the trees followed by merging the buffers.)
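To make the distance-threshold idea concrete, here is a small brute-force sketch in plain Python. It assumes a binary indicator grid (1 = trees, 0 = no trees) and square cells; the function names and the `cell_size` parameter are illustrative. A real Euclidean Distance tool uses a much faster algorithm, so treat this only as a statement of what is being computed.

```python
import math


def euclidean_distance(mask, cell_size=1.0):
    """Distance from each cell center to the nearest tree cell center.

    Brute force: every cell is compared with every tree cell.
    Cells get an infinite distance if the grid contains no trees.
    """
    trees = [(r, c)
             for r, row in enumerate(mask)
             for c, v in enumerate(row) if v]
    return [[cell_size * min((math.hypot(r - tr, c - tc) for tr, tc in trees),
                             default=math.inf)
             for c in range(len(mask[0]))]
            for r in range(len(mask))]


def within_distance(mask, max_distance, cell_size=1.0):
    """1 for every cell no farther than max_distance from a tree cell."""
    dist = euclidean_distance(mask, cell_size)
    return [[1 if d <= max_distance else 0 for d in row] for row in dist]


# Hypothetical example: one tree in the middle of a 5 x 5 grid.
trees = [[0] * 5 for _ in range(5)]
trees[2][2] = 1
forest = within_distance(trees, max_distance=1.5)
```

Because the threshold is a true distance rather than a count of expansion steps, the resulting patches have rounded rather than blocky outlines, and the cost of the computation does not grow with the size of the buffer.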

A more sophisticated solution along these lines can be developed by computing a cost distance or path distance grid with the trees as the origin: in this fashion, the buffering radius can be made to vary with other aspects such as the terrain, soil type, and natural barriers like wide rivers. This approach is tantamount to developing a scientific model relating the presence of forest to other factors.
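A cost distance grid can be illustrated with Dijkstra's algorithm on the cell lattice. The sketch below is a generic version under stated assumptions, not the ArcGIS tool: movement is allowed between 8-neighbors, a step is charged the mean of the two cells' costs times the step length, and a barrier is represented by an infinite cost. The `cost` grid (which might encode terrain or soil type) and all function names are hypothetical.

```python
import heapq
import math


def cost_distance(mask, cost):
    """Least accumulated cost from any tree cell to every cell.

    mask: 1 = trees (origins), 0 = no trees.
    cost: per-cell cost of travel; math.inf marks an impassable barrier.
    """
    rows, cols = len(mask), len(mask[0])
    acc = [[math.inf] * cols for _ in range(rows)]
    heap = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                acc[r][c] = 0.0
                heap.append((0.0, r, c))
    heapq.heapify(heap)
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > acc[r][c]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                    # Assumed step cost: mean cell cost times step length.
                    step = math.hypot(dr, dc) * (cost[r][c] + cost[rr][cc]) / 2
                    if d + step < acc[rr][cc]:
                        acc[rr][cc] = d + step
                        heapq.heappush(heap, (d + step, rr, cc))
    return acc


def within_cost(mask, cost, max_cost):
    """1 for every cell reachable from a tree at accumulated cost <= max_cost."""
    acc = cost_distance(mask, cost)
    return [[1 if a <= max_cost else 0 for a in row] for row in acc]


# Hypothetical example: a barrier (say, a wide river) in the third column.
trees = [[1, 0, 0, 0, 0]]
cost = [[1, 1, math.inf, 1, 1]]
forest = within_cost(trees, cost, max_cost=3.0)
```

With a uniform cost of 1 this reduces to an approximation of the Euclidean distance buffer; raising the cost in some cells shrinks the buffer there, and an infinite cost stops it altogether.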

Statistical Modeling

In the same modeling spirit, one can run a logistic regression to predict the presence of trees based on other variables such as terrain, soil type, insolation, and so on, and then apply the fitted model to create a grid of tree probabilities. Selecting the high-probability cells will create the desired forest patches. This kind of statistical work is best carried out with a statistical package (such as R with one or more of its spatial libraries installed) rather than in ArcGIS, so I will not discuss this further here.

Other Solutions

An extensive thread on concave hulls shows how to delineate groups of points. To apply these algorithms (which themselves can be slow), first convert the grid to a vector dataset of forested cell points. Most of these solutions would approximately reproduce the morphological solution previously described where an expansion is followed by an equivalent contraction.

General Comments

It is best to perform raster operations on raster data, rather than attempting to convert back and forth to vector representations. If a vector representation is desired as output (for instance, if vector polygons are needed), then perform this conversion at the end of the processing. Following this principle tends to create workflows that are efficient both computationally and for the analyst (it often minimizes resampling errors that creep in during conversions, too). This is evident in the simplicity and shortness of the several raster-only solutions suggested here: they require at most two raster operations (and only one if the forested grid had originally been in binary indicator form, which is more convenient for analyses like these).
