R – Calculating Confidence Interval for GAM Model

confidence intervalgeneralized-additive-modelr

Reading mgcv::gam's help page:

confidence/credible intervals are readily available for any quantity
predicted using a fitted model

However I can't figure a way to actually get one. I thought predict.gam would have a type=confidence and a level parameter but it doesn't.
Can you help me on how to create it ?

Best Answer

In the usual way:

p <- predict(mod, newdata, type = "link", se.fit = TRUE)

Then note that p contains a component $se.fit with standard errors of the predictions for observations in newdata. You can then form CI by multipliying the SE by a value appropriate to your desired level. E.g. an approximate 95% confidence interval is formed as:

upr <- p$fit + (2 * p$se.fit)
lwr <- p$fit - (2 * p$se.fit)

You substitute in an appropriate value from a $t$ or Gaussian distribution for the interval you need.

Note that I use type = "link" as you don't say if you have a GAM or just an AM. In the GAM, you need to form the confidence interval on the scale of the linear predictor and then transform that to the scale of the response by applying the inverse of the link function:

upr <- mod$family$linkinv(upr)
lwr <- mod$family$linkinv(lwr)

Now note that these are very approximate intervals. In addition these intervals are point-wise on the predicted values and they don't take into account the fact that the smoothness selection was performed.

A simultaneous confidence interval can be computed via simulation from the posterior distribution of the parameters. I have an example of that on my blog.

If you want a confidence interval that is not conditional upon the smoothing parameters (i.e. one that takes into account that we do not know, but instead estimate, the values of the smoothness parameters), then add unconditional = TRUE to the predict() call.

Also, if you don't want to do this yourself, note that newer versions of mgcv have a plot.gam() function that returns an object with all data used to create the plots of the smooths and their confidence intervals. You can just save the output from plot.gam() in an obj

obj <- plot(model, ....)

and then inspect obj, which is a list with one component per smooth. Add seWithMean = TRUE to the plot() call to get confidence intervals that are not conditional upon smoothness parameter.