[GIS] How to exclude outliers from raster using ArcGIS for Desktop

arcgis-10.0arcgis-desktopoutlierraster-calculatorspatial-analyst

I calculated terrain curvature from a DEM. The DEM I use is interpolated from point elevation data. Generally, the DEM seems to be good, but some artifacts from interpolation exist. Those artifacts cause extreme values for curvature. Around 90% of the curvature values are between -25 and 25, but the other 10% are between -500 and 500.

I can use histogram equalize to properly visualize the curvature raster. For further analysis however I'd like to replace this extreme values.

So my question is: How can I exclude this extreme values from the raster?
I imagine something like this would work (pseudo-code):

m = median("curvature")
IF("curvature.value") > m * 10 THEN ("curvature.value") = m * 10
IF("curvature.value") < m * 10 THEN ("curvature.value") = m * 10

What tool would I use for this? I tried Raster Calculator, but I couldn't get the syntax right.

Software: ArcGIS 10 with spatial and 3D analyst

Best Answer

I wouldn't clamp the values, as suggested by the pseudocode, because the curvature artifacts do not necessarily correspond to high-curvature areas. Better cures for the problem include:

  1. Consider smoothing the DEM before computing the curvature. This would slightly reduce most curvatures but might give reasonable curvature values at all locations. Smoothing the DEM can be done with local medians or means.

  2. Find a robust way to estimate curvature. (You won't find this in any Arc* software, but with the availability of general-purpose tools like R, this is a feasible recommendation.) In general, curvature is estimated by fitting a quadratic (or some specialized low-order polynomial) in each neighborhood. The usual fitting method is least-squares, which can be sensitive to slight local variations. By expanding the neighborhood and using a fitting method that downweights outlying residuals (such as IWLS), you might be able to get representative and accurate curvature estimates everywhere.

  3. Replace the extreme curvature values with NoData. This is still a draconian measure, but would be appropriate if the extreme values are really artifacts of the interpolation (and don't, therefore, reflect truly high curvatures).

  4. Choose an interpolation method that produces smoother surfaces!

The 'map algebra' syntax for replacing values with NoData uses the SetNull command, as in

SetNull(Abs([Curvature grid]) > 10*Abs(m), [Curvature grid])

By the way, m should not be the median curvature: the median often will be close to zero and won't reflect typical large curvatures (positive or negative) A large multiple of the median absolute curvature might be a reasonable choice. Better yet, examine the histogram of curvatures to find where to cut off the values.

If the SetNull approach pokes too many NoData holes in the grid, think about whether the curvature grid should be smoothly varying. If so, and if those holes tend to be isolated (or in narrow stripes--which would be a strong indicator that a poor interpolator was used), you can fill them by computing a focal mean and pasting the mean values over the holes. The pasting is done with the conditional operator, as in

Con(IsNull([Modified curvature grid]), [Focal mean grid], [Modified curvature grid])

Note the use of IsNull to detect the NoData cells.