GAMs Plotting – Plotting Generalized Additive Models on Response Scale with Multiple Smooth and Linear Terms

generalized-additive-modelmgcv

Here is a quick reproducible script to plot GAMs on the response scale instead of the "smooth" scale for a single term via the 'mgcv' R package.

I made the below script for a friend who is using GAMs for his PhD. work in evolutionary biology. I use the Hubble data from the 'gamair' R package simply as a demonstration.

##### Example of GAM Plotting on Response Scale #####

### Load required packages ###

library(gamair)
library(mgcv)

### Data setup ###

data(hubble)
hubble # 24 observations

x <- hubble$x # predictor
y <- hubble$y # response


### GAM setup ###

mod <- gam(y ~ s(x)) # thin plate GAM with k = 10 degrees of freedom (by default)

plot(mod) # smooth term is on y-axis

### GAM prediction ###

pd <- data.frame(x = seq(1, 24, by = 0.1)) # fine grid of points
pr <- predict(mod, newdata = pd, type = "response", se = TRUE) # get predicted response values from GAM


### GAM plotting ###

with(hubble, plot(x, y, ylim = c(0, 2000))) # plot data
lines(pd$x, pr$fit) # plot predicted fit
lines(pd$x, pr$fit - qnorm(0.975) * pr$se.fit, lty = 2) # plot lower 95% CI endpoint
lines(pd$x, pr$fit + qnorm(0.975) * pr$se.fit, lty = 2) # plot upper 95% CI endpoint

The above script works like a charm. The difficulty arises when the GAM contains multiple terms, even though only a single term is plotted at a time.

By a GAM with multiple terms I mean something like

mod <- gam(y ~ s(x) + z).

Here 'z' is linear term (not a smooth term).

Could someone (@gavinsimpson?) provide a quick example of plotting such a GAM on the response scale?

I haven't been able to find such an example online or in Simon Wood's great book on GAMs and mgcv.

Best Answer

If the model contains z then the effect of x estimated by the model is that given z is in the model. Hence the fitted response is the additive sum of the two effects, and we can't talk generally about the estimated values of the response for a range of values of x without also stating the value of z.

For Gaussian models, you can just add on the intercept in plot.gam() to shift around the smooth curve on the y-axis. See argument shift to plot.gam(). This assumes that as per the example x and z are unrelated in the model, and furthermore some value of z (in this case I think 0 as it is a linear term not subject to identifiability constraints).

A more general solution is just to predict from the model at a grid of values over x while holding z constant at some representative value, say its mean or median.

Here's a full example of doing this by hand:

library("mgcv")
library("ggplot2")

set.seed(1)
df <- gamSim()
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

new_data <- with(df, expand.grid(x2 = seq(min(x2), max(x2), length = 200),
                                 x0 = median(x0),
                                 x1 = median(x1),
                                 x3 = median(x3)))

ilink <- family(m)$linkinv
pred <- predict(m, new_data, type = "link", se.fit = TRUE)
pred <- cbind(pred, new_data)
pred <- transform(pred, lwr_ci = ilink(fit - (2 * se.fit)),
                        upr_ci = ilink(fit + (2 * se.fit)),
                        fitted = ilink(fit))

ggplot(pred, aes(x = x2, y = fitted)) +
  geom_ribbon(aes(ymin = lwr_ci, ymax = upr_ci), alpha = 0.2) +
  geom_line()

producing

That script should be fine for any of the standard family options in mgcv, but you'll have to take careful note of what predict() returns for some of the fancier families in mgcv.

Related Solutions

Solved – GAM selection when both smooth and parametric terms are present

You could do what you want for linear terms using the paraPen argument to gam(), which allows penalties on parametric terms.

However, why not treat the linear terms as low-degree smooths (say k = 3) and let the double penalty work on it too?

For the categorical terms, I'd just leave them alone; I'm not sure it is possible to apply a group penalty to categories using paraPen. For something like year, it is highly unlikely that it will have a zero effect (all years exactly the same). I'd be inclined to either:

treat year as categorical and just leave it alone penalty-wise, so you control for between year differences in the expectation of the response, or
if you have enough years and you might expect a smooth trend in the data, treat it as smooth s(year).

Solved – Interpretation of Tensor product splines with multiple terms and their standard errors

These are not tensor product splines. What you are fitting are 2d and 3d thin plate splines. These kinds of smooths are generally inappropriate for the models you fit here because they assume isotropy; that a unit change in one marginal covariate is the same as a unit change in the other marginal covariate and that therefore a single smoothness penalty is appropriate.

Tensor product splines are produced by the te() function (or t2() for a different parameterisation).

Apart from the first vis.gam() call, you are trying to visualise the partial effect of two covariates conditional upon the others (the other covariates are generally held at some typical value, such as their mean). In the same way that including or removing a term in a simple regression model changes the estimated effect (the partial effect) of a covariate in both models (unless the covariates are orthogonal), changing what terms are in the GAM causes changes in the estimated partial effects of those covariates.

This is compounded here because you are removing or including terms from a single high-dimensional smooth, and furthermore, you are not changing the order of the terms in the model independently of which terms are in the model.

Compare

library('mgcv')
data(airquality)

m1 <- gam(Ozone ~ s(Day, Month, Temp), data=airquality, method = 'REML')
m2 <- gam(Ozone ~ s(Month, Day, Temp), data=airquality, method = 'REML')

> printCoefmat(summary(m1)$s.table)
                     edf Ref.df      F   p-value    
s(Day,Month,Temp) 13.495 16.379 9.9337 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> printCoefmat(summary(m2)$s.table)
                     edf Ref.df      F   p-value    
s(Month,Day,Temp) 13.494 16.379 9.9338 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

There are some slight differences but fitted values and summaries compare to the the first few decimal places. These difference are likely due to the way the 3d thin plate spline basis is being set up — it is a low-rank thin plate spline.

If we switch to tensor product smooths then the two models are the same regardless of order (note I have to set k lower here as it applies to each marginal so we have k^3 terms here prior to application of identifiability constraints).

m1 <- gam(Ozone ~ te(Day, Month, Temp, k=4), data=airquality, method = 'REML')
m2 <- gam(Ozone ~ te(Month, Day, Temp, k=4), data=airquality, method = 'REML')

> all.equal(fitted(m1), fitted(m2))
[1] TRUE

Best Answer

Related Solutions

Solved – GAM selection when both smooth and parametric terms are present

Solved – Interpretation of Tensor product splines with multiple terms and their standard errors

Related Question