Solved – Using a gamm4 model to predict estimates in new data

gamm4generalized-additive-modellme4-nlmer

I have been experimenting with gamm4 to derive GAMMs of some repeated measures data.

The models looks very nice and seem to give more flexibility than my LMMs.

Ultimately I want to compare models not by the quality of their fit (also the reality of comparing LMM and GAMM fits seems complex?), but in the quality of their predictions in new data sets, and in simulated new data by MCMC.

With LMMs I predict using the fixed effects only using:

mm <- model.matrix(terms(lmer),newdata)

newdata$predicted <- mm %*% fixef(lmer)

This is fine since we are predicting in new individuals, with new independent random effects.

I cannot get this predict method to work with gamm4.

> mm <- model.matrix(terms(gamm4$mer), newdata)

Error in model.frame.default(object, data, xlev = xlev) : 
  variable lengths differ (found for 'X')

I think this is because the GAM process creates new variables in order to transform predictor variables.
It is also complex, because I believe the transforms are stored as random effects, so I would need to extract these random effects, but not the “individual level” random effects.

Does anyone know how I can:

  1. Extract just the transform effect terms from the gamm4 model?
  2. Make predictions into new data using gamm4?
  3. Extract the model specification of the GAMM so I could implement it as a standalone algorithm?
  4. General advice?

Best Answer

  1. I'm not sure what you want here.

  2. Have you looked at ?predict.gam

    # Load the gamm4 package
    library(gamm4)
    
    # Using gamm4's built-in data simulation capabilities to give us some data:
    set.seed(100) 
    dat <- gamSim(6, n=100, scale=2)
    
    # Fitting a model and plotting it:
    mod <- gamm4(y~s(x0)+s(x1)+s(x2), data=dat, random = ~ (1|fac))
    plot(mod$gam, pages=1)
    
    # Generating some new data for which you'd like predictions:
    newdat <- data.frame(x0 = runif(100), x1 = runif(100), x2 = runif(100)) 
    
    # Getting predicted outcomes for new data
    # These include the splines but ignore other REs
    predictions = predict(mod$gam, newdata=newdat, se.fit = TRUE)
    
    # Consolidating new data and predictions
    newdat = cbind(newdat, predictions)
    
    # If you want CIs 
    newdat <- within(newdat, {
        lower = fit-1.96*se.fit
        upper = fit+1.96*se.fit
    })
    
    # Plot, for example, the predicted outcomes as a function of x1...
    library(ggplot2)
    egplot <- ggplot(newdat, aes(x=x1, y=fit)) + 
              geom_smooth() + geom_point()
    egplot
    
  3. See here for some possible assistance