[GIS] Interpolating Values in ArcMap

arcgis-desktoparcmapinterpolationspatial-analyst

I wanted to interpolate the trend values (trend of sunshine hour calculated over 20 years) throughout the study area from the available station points.

Can you help me with the selection of interpolation tools to use in ArcMap to do that (IDM, Kriging, Natural Neighbor, spline and all that)?

Best Answer

I want to give you some hints about the differences in the methods.

More information can also be found on the esri help pages An overview of the Interpolation toolset

Because your variable (sunshine) depends on a second variable (level of pollution) Kriging may be a good method. You can use your second variable as an “external drift”. Kriging requires very deep knowledge of geostatistic.

If you start with interpolation analysis I would start with simpler methods: IDW, Spline (with barriers ) or Natural Neighbor.

Natural Neighbor does not support barriers. I believe that barriers are important for you.

So let's look at some differences between IDW and Spline with barriers:

Spline:

using a two-dimensional minimum curvature spline technique
creates a smooth surface
The resulting surface passes exactly through the input points. But: Usually some values of the resulting surface can be higher than the maximum of the input values (and lower than the minimum of the input values). You have to decide if that is possible for you. I think that it's good for you because your stations are probably not right there where in reality the minimum or maximum values occurs. Try it and see if the calculated minima and maxima are meaningful.

IDW:

less smooth surface than spline
best result when our station distribution is dense (regard to the local variation of sunshine)
The maximum value of the resulting surface is never higher than the maximum of the input value. The minimum value of the resulting surface is never lower than the minimum of the imput value.
The resulting raster is a convex hull. Therefore, the border of your area of interest may not be entire covered by the resulting raster.

I would start with Spline with barriers. You can use Spatial Analyst for this. The profile tool of 3D Analyst can help to examine the results.

Related Solutions

Arcpy – How to Exclude Stations from IDW Interpolation Using Arcpy

I would like to share a solution that works with any "black-box" implementation of IDW. It takes no programming and requires only two applications of IDW and a (simple, fast) grid division. This means it will take twice as long as need be, but its universal application, its simplicity, and the lack of any need to modify the IDW code can be great advantages in practice.

Recall that IDW (with exponent p, usually p = 2) predicts the value Z at any location P in terms of observed values Z(1), Z(2), ..., Z(n) at other locations P(1), P(2), ..., P(n) respectively with the formula

Z = [w(1)*Z(1) + w(2)*Z(2) + ... + w(n)*Z(n)] / w,
w(i) = d(i)^(-p),
w = w(1) + w(2) + ... + w(n).

The numbers d(i) are the distances between P and P(i). Let's call the w(i) the "raw weights." (The actual weights multiplying the values are w(i)/w.)

Suppose the first k of the observed values are missing. Applying IDW to the non-missing values yields

Z' = [w'(k+1)*Z(k+1) + ... + w'(n)*Z(n)] / w',
w'(i) = d(i)^(-p) = w(i)
w' = w'(k+1) + ... + w'(n) = w(k+1) + ... + w(n).

Because the new raw weights w'(i) are equal to the original raw weights w(i), we may drop the primes on the w'(i). We can force this new formula to look a lot like the first one by setting the missing values to zero: Z(1) = Z(2) = ... = Z(k) = 0. Then

Z' = [w(1)*0 + w(2)*0 + ... + w(k)*0 + w(k+1)*Z(k+1) + ... + w(n)*Z(n)] / w'.

This is exactly what IDW would give us when applied to all n values (with missings set to zero) except for the w' in the denominator rather than the original w. The difference is that w' is the sum of the raw weights for the non-missing values only, whereas w is the sum of the raw weights at every location. If we knew the ratio w'/w, we could divide the IDW value for Z' by this ratio to obtain what we need. How to find this ratio? A little inspection of the formulas suggests this trick: set the missing values to zero and the non-missing values to 1. That is, let Z(1) = Z(2) = ... = Z(k) and Z(k+1) = Z(k+2) = ... = Z(n) = 1. Applying IDW to this synthetic dataset yields

[w(1)*Z(1) + w(2)*Z(2) + ... + w(n)*Z(n)] / w
= [w(1)*0 + ... + w(k)*0 + w(k+1)*1 + ... + w(n)*1)] / w
= [w(k+1) + w(k+2) + ... + w(n)] / w
= w'/w.

We have it!

The solution therefore is this:

Calculate IDW values based on the artificial data created by setting all missing values to zero and all non-missing values to one. As we just saw, this grid contains the ratios w'/w.
Calculate IDW values at the same locations based on setting all missing values to zero and keeping all non-missing values as they should be.
Divide the second set of values by the first set.

This solution does not need any special NoData values in the attribute table: usually, missing values are indicated by some special numerical code, such as -9999. Replacing missing values or non-missing values by constants is merely a matter of matching this code and overwriting the values by the desired constant.

Notice that step (1) does not have to be done at every date: it only has to be performed for each configuration of missing values. For instance, if the same monitoring locations are missing for an entire year, then steps (2) and (3) have to be done for each date in that year but step (1) only needs to be computed once for that year.

Finally, please note that this analysis has assumed IDW will use the same neighborhoods in any case. This will happen with a fixed-distance neighborhood search (or when all the locations are always used). It will not happen with a variable neighborhood search.

DEM Interpolation Methods – Linear vs. Natural Neighbors: Detailed Comparison

I regret that I didn't see your post back when you wrote it. But I've made a picture that I think shows the essential difference between the two. Both interpolations produce continuous surfaces and both will preserve the original values (so that if you take a query point at the position of an input sample point, you get back the original value associated with that sample). The biggest difference is in the way slope is handled. In linear, the slope is constant over the entire area of one of the triangular facets. In natural neighbors it changes continuously. This characteristic results in linear "artifacts" in the linear interpolation. On the other hand, natural neighbors has a more curved appearance. Also, in linear, the slope is discontinuous across the edges of the triangular facets. In natural neighbors, it is continuous everywhere except right on top of the input sample points.

Recently, I posted a wiki article about natural neighbors, you can find it at The Tinfor Project's Introduction to Natural Neighbor Interpolation.

Here's the image. Linear interpolation is on the left, natural neighbors on the right.

Best Answer

Related Solutions

Arcpy – How to Exclude Stations from IDW Interpolation Using Arcpy

DEM Interpolation Methods – Linear vs. Natural Neighbors: Detailed Comparison

Related Question