Solved – Gaussian Process smooths in mgcv: choosing between spherical and exponential covariance functions

Tags: mgcv, r

A colleague of mine explained that in variography, the exponential covariance function often does a better job of fitting spatial data and generating accurate predictions than the spherical covariance function does.

Has there been any work done on Gaussian process regression (AKA kriging) as it exists in the 'mgcv' R package? I can't seem to find anything in Simon Wood's mgcv text.

I am fitting some spatial GAMs and am tempted to select the exponential covariance function over the spherical one based on my colleague's remarks.

Should I instead be testing both covariance functions using gam.check/AIC etc. before making a final call? I am fitting a lot of simulated spatial datasets, so repeatedly checking between covariance functions is tedious and time-consuming.

Best Answer

The gp smooth type is discussed only in the second edition of Simon's book, as it was added to mgcv long after the first edition went to press.

The main difference to consider is that the spherical covariance function is not entirely smooth: it has a discontinuity that can pass through to the resulting smoother. The Matérn and power exponential functions do not suffer from this problem.

The discontinuity arises because the spherical covariance function takes the value 0 whenever the distance between two points exceeds $\rho$, the range parameter:

$$ c(d)=\begin{cases} 1 - 1.5(d/\rho) + 0.5(d/\rho)^3, & \text{if } d \leq \rho,\\ 0, & \text{otherwise}, \end{cases} $$
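As a quick numerical check (a sketch; `sph_cov` and `exp_cov` are illustrative helper names, not mgcv functions), we can evaluate the two covariance functions near $d = \rho$ and confirm that the spherical one drops exactly to zero beyond the range while the exponential one keeps decaying smoothly:

```r
## spherical covariance: piecewise, exactly 0 beyond the range rho
sph_cov <- function(d, rho) {
  ifelse(d <= rho, 1 - 1.5 * (d / rho) + 0.5 * (d / rho)^3, 0)
}

## exponential covariance: smooth and strictly positive for all finite d
exp_cov <- function(d, rho) exp(-d / rho)

rho <- 0.175
d <- c(0.5, 1, 1.001) * rho  # just inside, at, and just beyond the range
sph_cov(d, rho)              # last value is exactly 0
exp_cov(d, rho)              # all values remain positive
```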

where $d$ is the distance between a pair of points. We can see the effect of this in the following R example, modified from ?smooth.construct.gp.smooth:

library("mgcv")
set.seed(24)
eg <- gamSim(2, n = 300, scale = 0.05)
b  <- gam(y ~ s(x, z, bs = "gp", k = 50, m = c(3, 0.175)), data = eg$data, method = "REML") ## Matern (kappa = 1.5)
b1 <- gam(y ~ s(x, z, bs = "gp", k = 50, m = c(1, 0.175)), data = eg$data, method = "REML") ## spherical
b2 <- gam(y ~ s(x, z, bs = "gp", k = 50, m = c(2, 0.175)), data = eg$data, method = "REML") ## power exponential

op <- par(mfrow=c(2,2), mar = c(0.5,0.5,3,0.5))
with(eg$truth, persp(x, z, f, theta = 30, main = "Truth")) ## truth
vis.gam(b, theta=30, main = "Matern")
vis.gam(b1, theta=30, main = "Spherical")
vis.gam(b2, theta=30, main = "Exponential")
par(op)

[Figure: perspective plots of the true surface and the Matérn, spherical, and exponential fits]

For the same value of $\rho$ we still recover a smooth surface with the Matérn and exponential covariance functions, unlike with the spherical one. That said, using a profile likelihood approach to determine an optimal value for $\rho$ should result in a fit closer to the Matérn or exponential fits shown in the figure.

## slightly modified from Wood (2017, p. 362)
REML <- r <- seq(0.1, 1.5, by = 0.05)  # candidate range parameters
for (i in seq_along(r)) {
  m <- gam(y ~ s(x, z, bs = "gp", k = 50, m = c(1, r[i])), data = eg$data, method = "REML")
  REML[i] <- m$gcv.ubre                # holds the REML score when method = "REML"
}

plot(REML ~ r, type = "o", pch = 16, xlab = expression(rho), ylab = "REML score")

[Figure: REML score against $\rho$, with a minimum near $\rho = 0.65$]

There is a minimum in the REML score at $\rho = 0.65$ in this sequence of values, which I understand to be close to the true effective range of ~0.7 for this example, and which results in a fit that is smooth like the Matérn and exponential fits shown earlier.

I doubt you'll find much difference between the various covariance functions. The main work is setting up a profile likelihood loop that fits the GAM over a series of values of the range parameter for each covariance function (and, where relevant, the power parameter), then going with the covariance function and parameters that yield the lowest REML or ML score. The example code above shows how this can be done for $\rho$.

The second edition of Simon's book has an example of this. In the spatial context you might get away with the defaults - which basically say that the effective range of the spatial correlation is as large as the maximum distance between pairs of observations - but it is a relatively simple thing to check with a loop over a range of values for this parameter.
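Such a loop over both the covariance type (m[1]) and the range might look like the following sketch. It repeats the simulated-data setup from the earlier example so it is self-contained, and the coarse grid of range values is purely illustrative to keep the number of fits small:

```r
library("mgcv")
set.seed(24)
eg <- gamSim(2, n = 300, scale = 0.05)

types <- c(spherical = 1, exponential = 2, matern = 3)  # m[1] codes
rhos  <- seq(0.2, 1.4, by = 0.3)                        # coarse, illustrative grid

## fit a GAM for every (type, range) combination, recording the REML score
scores <- expand.grid(type = types, rho = rhos)
scores$reml <- apply(scores, 1, function(p) {
  m <- gam(y ~ s(x, z, bs = "gp", k = 50, m = c(p[["type"]], p[["rho"]])),
           data = eg$data, method = "REML")
  m$gcv.ubre  # REML score when method = "REML"
})

## covariance function and range with the lowest REML score
scores[which.min(scores$reml), ]
```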

I'm not aware that gam.check can fully diagnose issues with the GP smoothers; you need to specify values of any required covariance-function parameters in advance for these smoothers to be estimated using the quadratic penalty approach adopted in mgcv. The output from gam.check will confirm whether the basis dimension is adequate, but you still need to optimise the other parameters of the covariance function you are using.
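To make that division of labour concrete, here is a sketch: the range value 0.65 is taken from the REML profile above, and the data setup is repeated so the snippet is self-contained. gam.check then speaks only to the basis dimension, not to the choice of range:

```r
library("mgcv")
set.seed(24)
eg <- gamSim(2, n = 300, scale = 0.05)

## refit the spherical model with the range chosen by profiling the REML score
b_final <- gam(y ~ s(x, z, bs = "gp", k = 50, m = c(1, 0.65)),
               data = eg$data, method = "REML")

## gam.check reports the k-index and effective degrees of freedom;
## it can confirm that k = 50 is adequate, but it says nothing about
## whether 0.65 was the best choice of range parameter
gam.check(b_final)
```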

Wood, S. N. (2017). Generalized Additive Models: An Introduction with R, Second Edition. CRC Press.
