Solved – gam models with random effect R

gamm4generalized-additive-modelpredictionrandom-effects-model

I am modeling fishery CPUE as a function of a number of a number of covariates using a GAM approach that includes fixed and random effects.

I understand that there are limitations with regards to predicting random effects (predict function only addresses fixed effects) with gamm4. How does the predict function in the basic gam (mgcv, using bs="re") deal with the random effects? Are they included in predictions? Any thoughts would be much appreciated…

Best Answer

Yes, they are included, but only ever for the observed levels of the random factor. You can turn this using the by variable smooth trick however.

Consider the following example taken from ?gam.models:

dat <- gamSim(1,n=400,scale=2) ## simulate 4 term additive truth

## Now add some random effects to the simulation. Response is 
## grouped into one of 20 groups by `fac' and each groups has a
## random effect added....
fac <- as.factor(sample(1:20,400,replace=TRUE))
dat$X <- model.matrix(~fac-1)                                  #$ rendering bug
b <- rnorm(20)*.5
dat$y <- dat$y + dat$X%*%b

rm1 <- gam(y ~ s(fac, bs="re") + s(x0) + s(x1) + s(x2) + s(x3),
           data = dat, method = "ML")

Now lets get the additive term contributions from the model and compare them with the full blown model predictions:

p <- predict(rm1, type = "terms")
head(rowSums(p) + attr(p, "constant"))
head(predict(rm1, type = "response"))

which gives

> head(rowSums(p) + attr(p, "constant"))
        1         2         3         4         5         6 
14.265260  6.433342  2.766193 12.864771  5.296381  7.341790 
> head(predict(rm1, type = "response"))
        1         2         3         4         5         6 
14.265260  6.433342  2.766193 12.864771  5.296381  7.341790

So we are convinced now that the two ways of generating the predicted values are equivalent. now look at p the additive term contributions to the fitted values:

> head(p)
       s(fac)      s(x0)     s(x1)      s(x2)        s(x3)
1 -0.03786017 -0.1683648  3.868927  2.6485134  0.157054343
2  0.21328630  0.5304765 -1.902366 -0.1972856 -0.007759325
3 -0.36501307  0.1058000 -1.661677 -3.0955348 -0.014372627
4 -0.12519987  0.5474540  2.554656  2.2189534 -0.128083342
5 -0.12519987 -0.3720668 -1.817144 -0.1451364 -0.041061989
6 -0.05481148  0.3490905 -1.216908  0.3411783  0.126251294

The first column is the s(fac) which was a random effect spline in the fitted GAM.

I will add that the gamm() function also in mgcv can give the within-group predictions (fitted values):

m2 <- gamm(y ~ s(x0) + s(x1) + s(x2) + s(x3),
           data = dat, method = "ML",
           random = list(fac = ~ 1))

head(predict(m2$lme))

> head(predict(m2$lme))
1/1/1/1/14  1/1/1/1/6 1/1/1/1/19  1/1/1/1/9  1/1/1/1/9 1/1/1/1/15 
 14.265259   6.433345   2.766196  12.864770   5.296383   7.341790 
> head(predict(rm1, type = "response"))
        1         2         3         4         5         6 
14.265260  6.433342  2.766193 12.864771  5.296381  7.341790