Solved – the nugget effect

gaussian process, geostatistics, kriging, spatial, variogram

I don't understand exactly what is meant by the term "nugget effect" in geostatistics. In empirical variogram plots of $\gamma(h)$ against the lag $h$, the nugget is defined as the discontinuity at the origin, i.e. the jump in $\gamma$ as the lag $h$ approaches $0$.


When $h=0$ you should also get $\gamma=0$ because you should be getting the same point, but this doesn't always occur in practice.

If you want to do kriging (best linear unbiased interpolation), you need to replace the empirical variogram with an appropriate model for the covariance. In this context people also talk about the nugget licit model for the variogram, which appears to be defined as follows:
$$
\gamma(h) =
\begin{cases}
0, & \text{if } h = 0 \\
c, & \text{otherwise}
\end{cases}
$$

where $c$ is the sill or asymptotic value of $\gamma$.
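
As a minimal sketch, the pure nugget model defined above can be written as a function of the lag. The second function is an illustration only: it assumes a nugget combined with an exponential structure, and its parameter names (`nugget`, `partial_sill`, `range_param`) are hypothetical, not taken from any particular library.

```python
import math


def pure_nugget(h: float, c: float = 1.0) -> float:
    """Pure nugget variogram: 0 at zero lag, the constant c at every other lag."""
    return 0.0 if h == 0 else c


def nugget_plus_exponential(h: float, nugget: float, partial_sill: float,
                            range_param: float) -> float:
    """Nugget plus an exponential structure (illustrative parameter names).

    The value is 0 at h = 0, jumps to `nugget` just past h = 0, and then
    climbs towards the total sill nugget + partial_sill as the lag grows.
    """
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))


# Compare the two models at a few lags, including one very close to zero.
for lag in (0.0, 0.01, 1.0, 10.0):
    print(lag, pure_nugget(lag, c=2.0),
          round(nugget_plus_exponential(lag, 0.5, 1.5, 2.0), 4))
```

In the second model the discontinuity at the origin is the `nugget` argument, while the sill is `nugget + partial_sill`; in the pure nugget model the two coincide.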

Sometimes people talk about the nugget effect.

nugget effect = sum of geological microstructure and measurement error (source)

or

Typically, only a small portion of the variability is explained by
random behavior. For historical reasons, this type of variogram
behavior is called the nugget effect. In early mining geostatistics,
the presence of gold nuggets in drillhole samples would lead to
apparently random variations—hence, nugget effect. (Source)

A comment on an answer to the question "What effect does data averaging have on the variogram?" seems to suggest that the true nugget effect is different from measurement error.

What is the nugget effect? How/why is it different from measurement error?

EDIT

After reading AdamO's response, I found the following helpful passage:

In practice, when the sampling design specifies a single measurement
at each of $n$ distinct locations, the nugget effect has a dual
interpretation as either measurement error or spatial variation on a
scale smaller than the smallest distance between any two points in the
sample design, or any combination of these two effects. These two
components of the nugget effect can only be separately identified if
the measurement error variance is either known, or can be estimated
directly using repeated measurements taken at coincident locations.

(p.57) Diggle, P. J., and P. J. Ribeiro. "Model-based Geostatistics". Springer Series in Statistics. Springer, 2007.
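
The identifiability point in this passage can be illustrated with a toy simulation. This is only a sketch under strong assumptions: locations are taken to be far enough apart that their values are independent, so the variance of single measurements plays the role of the total nugget, and both standard deviations are made-up values. The function name `split_nugget` is hypothetical.

```python
import random


def split_nugget(n_locations, micro_sd, error_sd, seed=0):
    """Separate measurement error from microscale variation using replicates.

    Each location gets one 'true' local value (microscale variation) and is
    measured twice, with independent measurement error each time.
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_locations):
        local_value = rng.gauss(0.0, micro_sd)
        first.append(local_value + rng.gauss(0.0, error_sd))
        second.append(local_value + rng.gauss(0.0, error_sd))

    # Single measurements only reveal the sum of the two variances.
    mean_first = sum(first) / n_locations
    total_nugget = sum((y - mean_first) ** 2 for y in first) / (n_locations - 1)

    # Differences between coincident replicates cancel the local value,
    # so half their mean square estimates the measurement error variance.
    error_var = sum((a - b) ** 2 for a, b in zip(first, second)) / (2 * n_locations)

    return total_nugget, error_var, total_nugget - error_var


total, error_var, micro_var = split_nugget(20_000, micro_sd=0.8, error_sd=0.5)
print("total nugget estimate:     ", round(total, 3))
print("measurement error variance:", round(error_var, 3))
print("microscale variance:       ", round(micro_var, 3))
```

With a single measurement per location only `total_nugget` could be estimated; the split into the two components relies entirely on the second, coincident measurement.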

Best Answer

In the context of estimating a variogram, a nugget allows the variogram to take a non-zero value as the distance between two observations approaches zero. It also implies that the correlation between adjacent observations is somewhat reduced: even arbitrarily close observations are not perfectly correlated.
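
One common way to make this concrete is a signal-plus-noise decomposition. The notation here is an assumption for illustration, not taken from the answer: observations are $Y(x) = S(x) + Z$, where $S$ is a stationary signal with variance $\sigma^2$ and correlation function $\rho(h)$ (continuous, with $\rho(0)=1$), and $Z$ is independent noise with variance $\tau^2 > 0$. Under these assumptions,

$$
\gamma(h) = \tau^2 + \sigma^2\{1 - \rho(h)\} \quad (h > 0), \qquad \gamma(0) = 0,
$$

$$
\operatorname{Corr}\{Y(x), Y(x+h)\} = \frac{\sigma^2 \rho(h)}{\sigma^2 + \tau^2} \;\to\; \frac{\sigma^2}{\sigma^2 + \tau^2} < 1 \quad \text{as } h \to 0 .
$$

So the nugget $\tau^2$ is the height of the jump in $\gamma$ at the origin, and it caps the correlation between arbitrarily close observations below one.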

I might say a nugget is a more general concept than measurement error, at least in geostatistics. Or they may be unrelated concepts altogether. "Measurement error" is an unclear term in statistics: with many sources of variability, it benefits us to be clear about our presumed sources of "error". Measurement error can be taken to imply that the measurement method is flawed. In the case of blood pressure, we would like to measure the pressure differential between blood leaving the heart and entering the heart. This "gold standard" is far too invasive to be done in practice, so we use an imperfect method that depends on a number of characteristics. The patient's 8-hour diet, their position (seated, lying down), the time of day, the volume of their heart, etc. all predict BP, but we ignore these values in practice: I contend these aren't measurement error. However, the reactiveness and training of the administrator and the quality of the cuff affect the quality of the reading without reflecting the actual blood pressure at all: I contend this is measurement error.

In geostatistics, and many other fields, we may conduct the "same measurement" or "nearly the same measurement" twice and expect different results. Presence of a nugget means that any two observations sampled arbitrarily closely will not necessarily have the same value. Not allowing for a nugget may be an undesirable constraint when the design permits collecting data of that nature.
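
A small simulation can show what this looks like in an empirical variogram. The sketch below makes several assumptions that are not from the answer: a one-dimensional transect on a regular grid, an AR(1) signal (which has an exponential correlation `phi ** lag`), and independent noise added to every observation to act as the nugget. All parameter values are arbitrary.

```python
import math
import random


def simulate_with_nugget(n, phi, sigma2, tau2, seed=0):
    """Simulate y = signal + noise along a regular 1-D transect.

    The signal is a stationary AR(1) process with variance sigma2 and
    lag-k correlation phi ** k; the noise has variance tau2 (the nugget).
    """
    rng = random.Random(seed)
    signal = rng.gauss(0.0, math.sqrt(sigma2))
    innovation_sd = math.sqrt(sigma2 * (1.0 - phi ** 2))
    ys = []
    for _ in range(n):
        ys.append(signal + rng.gauss(0.0, math.sqrt(tau2)))
        signal = phi * signal + rng.gauss(0.0, innovation_sd)
    return ys


def empirical_semivariance(ys, lag):
    """Half the mean squared difference between observations `lag` steps apart."""
    squared_diffs = [(ys[i + lag] - ys[i]) ** 2 for i in range(len(ys) - lag)]
    return 0.5 * sum(squared_diffs) / len(squared_diffs)


ys = simulate_with_nugget(n=100_000, phi=0.9, sigma2=1.0, tau2=0.25)
for lag in (1, 2, 5, 20):
    # Theoretical value under the assumed model: tau2 + sigma2 * (1 - phi ** lag)
    theory = 0.25 + 1.0 * (1.0 - 0.9 ** lag)
    print(lag, round(empirical_semivariance(ys, lag), 3), round(theory, 3))
```

Under these assumptions the semivariance at the shortest lag stays well above zero, because neighbouring observations differ by their independent noise terms even when the underlying signal barely changes.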

In many studies you simply can't sample the same area with replacement. Take blood pressure as an example: it is impossible to replicate a blood pressure measurement either by time or by location on the human body. Even if two arms are measured simultaneously, they will give different readings, and if the same arm is measured twice in immediate succession, the blood pressure fluctuates slightly in response to environmental temperature, metabolism, fatigue, duration in resting position, attenuation of the white coat effect, etc. These measures are certainly more strongly serially correlated than measures collected farther apart in time, at more distal parts of the body, or even in different people, but they are not perfectly correlated, so imposing an inappropriate variance structure could lead to considerably biased results.
