The concurvity moves from the stated smooth terms to the parametric terms, which concurvity() groups together under the para column of the matrix or matrices returned.
Here's a modified example from ?concurvity:
library("mgcv")
## simulate data with concurvity...
set.seed(8)
n <- 200
f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 *
(10 * x)^3 * (1 - x)^10
t <- sort(runif(n)) ## first covariate
## make covariate x a smooth function of t + noise...
x <- f2(t) + rnorm(n)*3
## simulate response dependent on t and x...
y <- sin(4*pi*t) + exp(x/20) + rnorm(n)*.3
## fit model...
b <- gam(y ~ s(t,k=15) + s(x,k=15), method="REML")
Now add a linear term and refit:
x2 <- seq_len(n) + rnorm(n)*3
b2 <- update(b, . ~ . + x2)
Now look at the concurvity of the two models:
## assess concurvity between each term and `rest of model'...
concurvity(b)
concurvity(b2)
These produce:
> concurvity(b)
para s(t) s(x)
worst 1.06587e-24 0.60269087 0.6026909
observed 1.06587e-24 0.09576829 0.5728602
estimate 1.06587e-24 0.24513981 0.4659564
> concurvity(b2)
para s(t) s(x)
worst 0.9990068 0.9970541 0.6042295
observed 0.9990068 0.7866776 0.5733337
estimate 0.9990068 0.9111690 0.4668871
Note that x2 is essentially a noisy version of t:

> cor(t, x2)
[1] 0.9975977

and hence the concurvity has gone up from essentially 0 in b to almost 1 in b2.
Now if we add x2 as a smooth function instead...

concurvity(update(b, . ~ . + s(x2)))

we see that the para entries return to being very small and we get a measure for the spline term s(x2) directly:
> concurvity(update(b, . ~ . + s(x2)))
para s(t) s(x) s(x2)
worst 1.506201e-24 0.9977153 0.6264654 0.9976988
observed 1.506201e-24 0.9838018 0.5893737 0.9963857
estimate 1.506201e-24 0.9909506 0.4921592 0.9943990
This is simply how the function handles parametric terms; its focus is on the smooth terms.
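Incidentally, if you want pairwise measures between individual terms, rather than each term against the rest of the model, concurvity() also accepts full = FALSE. A short sketch, continuing from the b2 fit above:

```r
## pairwise concurvity between each pair of terms in b2;
## full = FALSE returns a list of three matrices, one per measure
cp <- concurvity(b2, full = FALSE)
cp$worst     ## worst-case pairwise concurvity matrix
cp$estimate  ## the "estimate" measure, pairwise
```

This is often more informative than the full = TRUE summaries when you need to identify which pair of terms is responsible for the concurvity.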
Note: you are specifying gamma but fitting using REML. In versions of mgcv prior to 1.8-23, gamma only affected the GCV and UBRE/AIC methods of smoothness selection, so you could remove this argument as it had zero effect on the model fits. From version 1.8-23 of mgcv onwards, however, the gamma argument also affects models fitted using REML/ML, where smoothness parameters are selected by REML/ML as if the sample size were $n/\gamma$ instead of $n$.
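For instance (a sketch of my own, reusing y, t, and x from the simulated example above and assuming mgcv >= 1.8-23):

```r
## with gamma > 1, REML smoothness selection behaves as if the sample
## size were n/gamma, so fits are penalized more heavily (smoother)
b_gamma <- gam(y ~ s(t, k = 15) + s(x, k = 15),
               method = "REML", gamma = 1.4)
sum(b_gamma$edf)  ## total EDF; typically lower than with gamma = 1
```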
By default the tensor product constructor implementing a ti or te term will reparameterize the marginal smooths so that the coefficients are interpretable as values of the smooth at evenly spaced covariate values (see section 5.6.2 of Wood, 2017, Generalized Additive Models: An Introduction with R). This is done to improve the interpretability of the smoothing penalties. You can turn it off using ti(..., np=FALSE) -- see ?ti. The "cr" basis does not need to be re-parameterized, which is why you saw no difference in that case.

-- Simon Wood
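To illustrate the np switch (a sketch of mine using the simulated t and x from the first example; the particular model is not from the quoted text):

```r
## tensor product interaction with the marginal reparameterization
## turned off via np = FALSE (see ?ti); compare with the default fit
b_np <- gam(y ~ s(t) + s(x) + ti(t, x, np = FALSE), method = "REML")
```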
Best Answer
Q1
No, leave it in as an s() term; part of the wiggliness in range is taken up in the ti(day, range) term. In fact it would seem sensible to decide whether ti(day, range) is needed and, if it is, refit the model with te(day, range) rather than the separate s() terms plus the ti() term, if only for the reduction in the number of smoothing parameters that need to be estimated (from 8 with your model to 4 for the te() version -- which are doubled from the 4 or 2 parameters, I think, because of the shrinkage thin-plate splines you are using).

Q2
Values like this are common when you turn on shrinkage (either via shrinkage splines or via the double-penalty approach using select = TRUE). Both options allow shrinkage of the null space of the penalty matrix. This null space includes the set of basis functions that are completely smooth. Without the shrinkage, the smoothness parameters would shrink the wiggly parts of the model until all that was left was the linear function; in that case the range space is shrunk by the smoothness parameter but not the null space.

By adding an extra penalty, in your case by adding a small value to the zero eigenvalues of the penalty matrix, the null space can now also be shrunk via a second penalty and associated smoothness parameter. This allows the linear part of the spline to be shrunk a little (or a lot) towards the model intercept. If you're familiar with the LASSO penalty you will notice a similarity; the shrunken linear function doesn't have its full least-squares estimated effect but a somewhat smaller effect because of the shrinkage towards zero. The penalty is different in splines but the effect is similar. The less-than-1 EDF for the term just reflects this.
You can often find spline terms with EDF < 1 that are not linear. In that case it seems that the null space (the linear, perfectly smooth parts) has been shrunk as well as the range space (the wiggly parts), but the smoothness parameter(s) for the wiggly parts allow some small amount of wiggliness, and the shrinkage of the null space drags the EDF below 1. This is fine; the model is estimating a slightly wiggly function, but uncertainty in that estimate probably means that a linear function will reside in the 95% confidence interval for the smooth.
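As a sketch of the double-penalty route (reusing the simulated y, t, and x from the first example; the particular call is mine, not from the question):

```r
## select = TRUE adds a second penalty on each smooth's null space,
## so whole terms can be shrunk out of the model and a term's EDF
## can fall below 1
b_sel <- gam(y ~ s(t, k = 15) + s(x, k = 15),
             method = "REML", select = TRUE)
summary(b_sel)$edf  ## per-term effective degrees of freedom
```

Inspecting the per-term EDFs after fitting with select = TRUE is a quick way to see which terms have been shrunk towards (or below) a linear effect.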