Solved – Difference between s() and ti() terms in mgcv package when applied to one variable

basis functiongeneralized-additive-modelmgcvrsplines

I am using the mgcv package in R to fit logistic GAMs to survey data. In one of my models I use an interaction between two covariates. I am currently trying to fit this model using RStan and so am constructing the basis functions myself using B-splines. However I am struggling to reproduce the basis functions from the model.matrix output of a GAM including a ti() smooth term of one variable. I illustrate the problem below:

#Set-up
library(mgcv)
set.seed(1)

#Generate random data to fit gam to
y <- rnorm(1000,0,1)
x <- sample(1:30,1000,replace=T) # integers between 1 and 30

#Fit gam with 1D s() smooth term with a B-spline basis (bs = "ps")
gam_s <- gam(y ~ s(x, bs = "ps", k = 6, m=c(2,1)))

#Extract model matrix
X_s <- model.matrix(gam_s)

#Fit gam with 1D ti() smooth term with a B-spline basis (bs = "ps")
gam_ti <- gam(y ~ ti(x, bs = "ps", k = 6, m=list(c(2,1))))

#Extract model matrix
X_ti <- model.matrix(gam_ti)

#Compute vector of indexes of unique elements of x for easy plotting
index <- numeric()
for (i in 1:30)
index[i] <- which(x==i)[1]

#Plot results
par(mfrow=c(1,2))
plot(0,0,xlim=c(1,30),ylim=range(X_s),xlab="x",ylab="B(x)",type="n",main="1D s() basis functions for a B-spline basis")
for (i in 2:6)
  lines(1:30,X_s[index,i],col=i-1)
plot(0,0,xlim=c(1,30),ylim=range(X_ti),xlab="x",ylab="B(x)",type="n",main="1D ti() basis functions for a B-spline basis")
for (i in 2:6)
  lines(1:30,X_ti[index,i],col=i-1)

#Do the same for a cubic regression spline basis (bs = "cr")
gam_s <- gam(y ~ s(x, bs = "cr", k = 6))
X_s <- model.matrix(gam_s)
gam_ti <- gam(y ~ ti(x, bs = "cr", k = 6))
X_ti <- model.matrix(gam_ti)
plot(0,0,xlim=c(1,30),ylim=range(X_s),xlab="x",ylab="B(x)",type="n",main="1D s() basis functions for a cubic regression spline basis")
for (i in 2:6)
  lines(1:30,X_s[index,i],col=i-1)
plot(0,0,xlim=c(1,30),ylim=range(X_ti),xlab="x",ylab="B(x)",type="n",main="1D ti() basis functions for a cubic regression spline basis")
for (i in 2:6)
  lines(1:30,X_ti[index,i],col=i-1)

Running this R code, you should see that the first two plots using the B-spline basis and s()/ti() are different whereas the next two plots using the cubic regression spline basis and s()/ti() are identical. My question is why should the basis make a difference? For the B-spline basis I have worked out how to derive the s() plot – the original basis is transformed using a QR decomposition in order to satisfy the sum-to-zero constraint. But the ti() plot doesn't look like anything you could get by transforming a B-spline basis. The 1D basis functions are important because I use the tensor product of them to model the interaction. Any help would be greatly appreciated!

Best Answer

By default the tensor product constructor implementing a ti or te term will reparameterize the marginal smooths so that the coefficients are interpretable as values of the smooth at evenly spaced covariate values (see section 5.6.2. Wood, 2017, Generalized Additive Models: An introduction with R). This is done to improve the interpretation of the smoothing penalties. You can turn it off using ti(...,np=FALSE) -- see ?ti. The "cr" basis does not need to be re-parameterized, which is why you saw no difference in that case. Simon Wood

Best Answer

Related Solutions

Solved – GAM vs LOESS vs splines

Solved – Smoothing methods for gam in mgcv package

Related Question