Solved – When and why would you not want to use a GAM

generalized-additive-model, mgcv

I understand that GAMs in mgcv have the ability to reduce s(x) to a linear relationship with the response variable. If this is the case, why wouldn't you use a GAM?

When fitting smooth terms in gam() using s(x) my understanding is that if the relationship between x and the response variable is linear then a linear relationship will be fitted. Similarly, if the relationship between x and the response variable is non-linear then an appropriate non-linear relationship will be fitted. This is demonstrated when looking at summary(my_gam) – if s(x) has a linear relationship with the response variable the effective degrees of freedom (edf) in summary(my_gam) is 1 or approximately 1, indicating a linear relationship.

Therefore, I don't understand why you wouldn't use a GAM if it can model linear relationships should they exist, but also non-linear relationships should those exist. In other words, why use a GLM over a GAM if doing so means making additional assumptions about the relationship between the response and the predictor, assumptions that do not need to be made when using a GAM?

It seems that a GAM can do everything that a GLM can and more, but it doesn't do the 'and more' bit (i.e. fit non-linear relationships) unless it is needed.
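The behaviour described in the question can be checked with a small simulation. The sketch below is only an illustration: the data, the seed, and the object names (`d`, `m_linear`, `m_wiggly`) are made up, and `method = "REML"` is just one reasonable choice for smoothness selection. It fits the same `s(x)` term to a truly linear response and to a sinusoidal one, then compares the effective degrees of freedom.

```r
library(mgcv)

set.seed(42)
n <- 200
d <- data.frame(x = runif(n))
d$y_linear <- 1 + 2 * d$x + rnorm(n, sd = 0.3)         # straight-line truth
d$y_wiggly <- sin(2 * pi * d$x) + rnorm(n, sd = 0.3)   # clearly non-linear truth

m_linear <- gam(y_linear ~ s(x), data = d, method = "REML")
m_wiggly <- gam(y_wiggly ~ s(x), data = d, method = "REML")

# Effective degrees of freedom of s(x) in each model
edf_linear <- summary(m_linear)$edf
edf_wiggly <- summary(m_wiggly)$edf
print(c(linear = edf_linear, wiggly = edf_wiggly))
```

With data like these, the edf for the linear case should come out close to 1 and the edf for the sinusoidal case noticeably higher, although the exact values depend on the seed, the noise level, and the basis dimension.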

Best Answer

You could take this to the extreme and ask why we wouldn't use a non-parametric model like $k$-NN regression. In fact, the opposite question, "Why would anyone use KNN for regression?", has been asked, and you can check it for a more detailed discussion. You could also broaden the question and ask why we wouldn't use more complicated models instead of simpler ones. For example, why would anyone use logistic or linear regression if they could use a neural network?

The two main reasons for preferring simple models are:

Interpretability. Simple models like linear regression are directly interpretable, which is not necessarily true of more complicated models. Interpretability may be desirable in some disciplines (e.g. medicine), and even required by law in others (e.g. finance).
Overfitting. More complicated models are more prone to overfitting, especially with small sample sizes. A complicated model may simply memorize the training dataset and fail to generalize.

As noted in the comments, this also seems to be discussed in the following thread: "When to use a GAM vs GLM".

As a side note, using a model that is linear in its parameters is not that big a constraint. You can easily extend a linear model with polynomial components to model complex relationships, and this may even outperform neural networks in some cases (see Cheng et al., 2018 [arXiv:1806.0685]).
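To make the point about polynomial components concrete, here is a minimal sketch in base R. The simulated data and the object names (`fit_line`, `fit_poly`) are assumptions made up for illustration, and the cubic degree is picked only because the simulated truth is cubic.

```r
set.seed(1)
n <- 100
x <- runif(n, -2, 2)
y <- x^3 - 2 * x + rnorm(n, sd = 0.5)   # curved truth, simulated for illustration

fit_line <- lm(y ~ x)            # straight line
fit_poly <- lm(y ~ poly(x, 3))   # cubic polynomial, still linear in its coefficients

# Compare residual sums of squares and run an F-test on the nested models
rss <- c(line = deviance(fit_line), cubic = deviance(fit_poly))
print(rss)
print(anova(fit_line, fit_poly))
```

Both fits are ordinary least squares; the cubic model only adds extra columns to the design matrix, so the usual linear-model tools (coefficients, standard errors, nested-model tests) still apply.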

Related Solutions

Solved – When to use a GAM vs GLM

The main difference imho is that "classical" linear or generalized linear models assume a fixed linear (or some other parametric) form for the relationship between the dependent variable and the covariates, whereas GAMs do not assume any specific form a priori and can be used to reveal and estimate non-linear effects of the covariates on the dependent variable. In more detail, in (generalized) linear models the linear predictor is a weighted sum of the $n$ covariates, $\sum_{i=1}^n \beta_i x_i$, while in GAMs this term is replaced by a sum of smooth functions, e.g. $\sum_{i=1}^n \sum_{j=1}^q \beta_{ij} \, s_j \left( x_i \right)$, where $s_1(\cdot),\dots,s_q(\cdot)$ are smooth basis functions (e.g. cubic splines), $q$ is the basis dimension, and each basis function has its own coefficient $\beta_{ij}$. By combining the basis functions, GAMs can represent a large number of functional relationships (to do so they rely on the assumption that the true relationship is likely to be smooth rather than wiggly). They are essentially an extension of GLMs, but they are designed in a way that makes them particularly useful for uncovering non-linear effects of numerical covariates, and for doing so in an "automatic" fashion (in the words of Hastie and Tibshirani's original article, they have 'the advantage of being completely automatic, i.e. no "detective" work is needed on the part of the statistician').
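Written out with a link function $g$ and an intercept $\beta_0$ (both added here for completeness; this notation is an assumption, not taken from the cited article), and with a separate coefficient $\beta_{ij}$ for each basis function, the two linear predictors compare as follows:

$$
\begin{aligned}
\text{GLM:}\quad g\big(\mathbb{E}[y]\big) &= \beta_0 + \sum_{i=1}^n \beta_i x_i \\
\text{GAM:}\quad g\big(\mathbb{E}[y]\big) &= \beta_0 + \sum_{i=1}^n f_i(x_i), \qquad f_i(x_i) = \sum_{j=1}^q \beta_{ij}\, s_j(x_i)
\end{aligned}
$$

Choosing $f_i(x_i) = \beta_i x_i$ for every covariate recovers the GLM, which is the sense in which a GAM extends it.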

Solved – Plotting GAMs on Response Scale with Multiple Smooth and Linear Terms

If the model contains z, then the effect of x estimated by the model is conditional on z being in the model. Hence the fitted response is the additive sum of the two effects, and we can't talk generally about the estimated values of the response over a range of values of x without also stating the value of z.

For Gaussian models, you can just add the intercept in plot.gam() to shift the smooth curve along the y-axis; see the shift argument of plot.gam(). This assumes that, as in the example, x and z are unrelated in the model, and it also assumes some value of z (in this case I think 0, as it is a linear term not subject to identifiability constraints).
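As a rough sketch of the shift approach, assuming a Gaussian model fitted to the simulated data from gamSim() (the model, the choice of smooth via `select = 3`, and the name `intercept` are illustrative assumptions):

```r
library(mgcv)

set.seed(1)
df <- gamSim()   # simulated Gaussian data shipped with mgcv
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

intercept <- coef(m)[["(Intercept)"]]

# Plot only the third smooth, s(x2), moved up by the model intercept
plot(m, select = 3, shift = intercept)
```

The curve then sits at the level of the response when the other (centred) smooths contribute zero, which is not quite the same thing as holding the other covariates at their medians.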

A more general solution is just to predict from the model at a grid of values over x while holding z constant at some representative value, say its mean or median.

Here's a full example of doing this by hand:

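library("mgcv")
library("ggplot2")

# Simulate example data and fit a GAM with four smooth terms
set.seed(1)
df <- gamSim()
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Grid over x2, holding the other covariates at their medians
new_data <- with(df, expand.grid(x2 = seq(min(x2), max(x2), length.out = 200),
                                 x0 = median(x0),
                                 x1 = median(x1),
                                 x3 = median(x3)))

# Predict on the link scale, then back-transform the fit and the
# approximate 95% interval (fit +/- 2 standard errors) to the response scale
ilink <- family(m)$linkinv
pred <- predict(m, new_data, type = "link", se.fit = TRUE)
pred <- cbind(pred, new_data)
pred <- transform(pred, lwr_ci = ilink(fit - (2 * se.fit)),
                        upr_ci = ilink(fit + (2 * se.fit)),
                        fitted = ilink(fit))

# Fitted curve with a shaded interval
ggplot(pred, aes(x = x2, y = fitted)) +
  geom_ribbon(aes(ymin = lwr_ci, ymax = upr_ci), alpha = 0.2) +
  geom_line()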

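producing a line plot of the fitted values against x2, with a shaded ribbon for the approximate 95% interval.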

That script should be fine for any of the standard family options in mgcv, but you'll have to take careful note of what predict() returns for some of the fancier families in mgcv.
