In Esri's documentation on how to make a prediction in Kriging, I couldn't figure out the linkage between the semivariograms and the prediction process from reading the description therein:
After you have uncovered the dependence or autocorrelation in your
data (see Variography section above) … you can make a prediction
using the fitted model. Thereafter, the empirical semivariogram is set
aside.You can now use the data to make predictions. Like IDW interpolation,
kriging forms weights from surrounding measured values to predict
unmeasured locations…
Still, that doesn't say much about how kriging forms weights
.
Is it true that we're just using a different set of weight values than the 1/distance
in IDW?
If Kriging really just uses a different weight values than IDW, how can it solve the well-know summit problem for IDW? (whereby IDW cannot predict a higher value outside the range of the control points. e.g. One cannot correctly predict the elevation of a peak in a terrain based on lower elevation values of nearby data points.)
I understand the semivariogram measures average dissimilarity for a given separation distance, and the values are all non-negative.
Can someone explain how the combination/weighting method (e.g. with linear regression or any other variant) on a computed semivariogram works to solve the "summit" problem?
Best Answer
Yes, both IDW interpolation and Ordinary Kriging (OK) will calculate weights based on distance, but with different criteria. In both methods, weights do not depend on sample values. The answer from Dahn Jahn in Ordinary kriging example step by step? is very clarifying about this.
The main difference is that in Ordinary Kriging distances are used to measure the correlation structure among sample points (i.e. how similar they are based on lags of distances) and this correlation structure is captured in the form of the variogram model. The variogram model is used to calculate covariances (or semi variances): i) among sample points and; ii) between sample points and prediction points. These covariances are then, used to calculate weights (see how below).
The main processing steps in Ordinary Kriging are:
Background:
For a detailed mathematical explanation and theoretical insights about Ordinary Kriging refer to materials linked in 'References'.
Assume from bullet number 2 our variogram model is of type exponential:
where:
C(h)
= covariance;a
= nugget effect;σ²
= sill (σ² - a
= partial sill);r
= range.We want to predict a value so that
Ŷ0 = wiyi
whereŶ0
is the predicted value at location0
,wi
is the weight from samplei
andyi
is the observed value from samplei
.The idea of kriging is that its estimator
Ŷ0
is unbiased (i) and the estimation variance is minimized (ii):E(Ŷ0) = E(Y0)
. For this to happen,∑wi = 1
and the mean is stationary (i.e.E(Yi) = u
, given anyi
).σ²_ε = E[Y0 - Ŷ0]² = Var(Y0 - Ŷ0)
is minimized.In order to minimize
σ²_ε
, the Lagrange multiplier method is used:where
L
is the Lagrange function,λ
is the Lagrange multiplier and2λ(∑wi - 1)
is the part that guarantees∑wi = 1
.Cij
is the covariance among sample pointsi
andj
andCi0
is the covariance between sample pointi
and prediction point.Now, taking partial derivatives from
L
with respect towi
andλ
, setting the equations to zero, and solving them, will result in (skipped all calculations through the last step):Example:
Assume the following sample points (1, 2 and 3) and the prediction point (0), their respective values and distances:
Assume the covariance function with nugget
a = 1
, sillσ² = 4
and ranger = 6
:Calculate covariances among sample points:
Calculate covariances between sample points and prediction point:
Calculate weights:
So,
w1 = 0.353
,w2 = 0.353
,w3 = 0.293
and the Lagrange multiplier =-0.505
. See that∑wi = 0.353 + 0.353 + 0.293 = 1
.Predict value in point 0 (
Ŷ0
):Calculate kriging variance (aka minimized estimation variance):
Same example, in R:
Besides Kazuhito's answer, this is also addressed in Edzer Pebesma's answer to Kriging values greater than the most extreme value of the input layer?.
References: