Solved – How to code random slopes random intercept model with quadratic term and covariance unstructured in lme4

lme4-nlmemixed modelrtime series

I've worked through quite some documentation about both the nlme and lme4 package, (and quite some fora) but am still unsure whether I'm doing the right thing. Here's what I want to do:
I want to build a linear mixed model (on daily diary data: 5x a day, 28 days, so 140 measurements for 110 persons in two treatment groups) as follows:
Affect(dep.var) = Group + Time + Time^2 + group * time + group * time^2 + random.interceptsubjects + random.slopetime

My question is both theoretical and practical: 1. should I separately code random slopes for time^2, or is that nonsense, and if so, 2. is the following the correct way;

model <- lmer(affect ~ group + day + I(day^2) + group * day + group * I(day^2) + 
                      (day||ID) + (I(day^2)||ID),
                      data = my_data, REML = TRUE)

note: I added the double || in an attempt to get the covariance structure 'unstructured', as an enlightened mind in our department said I should.
If it is correct, if I run it, it says that the model is nearly unidentifiable: very large eigenvalue. lme4 suggests rescale variable.
3. is it a serious issue in the first place, or can I more or less ignore it,
4. if I have to rescale, how does one go about? It has to do with the quadratic term, I know that, but any suggestions as to how I should then rescale are more than welcome…

Best Answer

Raw polynomials of your day variable are simply powers of that variable. For example, day (which is day raised to power 1) and day squared (which is day raised to power 2) are raw polynomials. In your model, these raw polynomials are represented by the terms day and I(day^2). The problem with introducing multiple raw polynomials in your model is that they can be highly correlated with each other, thereby creating collinearity problems for the model. Orthogonal polynomials are preferred to raw polynomials as they combine to produce the same fit to the data (e.g., quadratic) for each person, while averting collinearity problems. This code - for one person - will give you more insights:

day <- 1:28  # day predictor for one person 

raw <- poly(day, 2, raw=TRUE) # raw contains day and day^2 for one person

raw

orthogonal <- poly(day, 2, raw=FALSE) # orthogonal versions of day and day^2 

orthogonal

options(scipen=999)

cor(raw) # high correlation between day and day^2

cor(orthogonal) # no correlation between orthogonal versions of day and day^2

See also https://stackoverflow.com/questions/19484053/what-does-the-r-function-poly-really-do.

You can fit models like this for raw polynomials:

model <- lmer(affect ~ poly(day,2, raw=TRUE)*group + 
                       (1 + poly(day,2, raw=TRUE)|ID),
                       data = my_data, REML = TRUE)

and like this for orthogonal polynomials:

model <- lmer(affect ~ poly(day,2, raw=FALSE)*group + 
                       (1 + poly(day,2, raw=FALSE)|ID),
                       data = my_data, REML = TRUE)

If you need to include random slopes just for the linear but not for the quadratic effect of age in your latter model, best to add your orthogonal polynomials to your dataset and give them appropriate names:

my_data$age.orth.1 <- poly(my_data$day, 2, raw=FALSE)[,1] # orthogonal version of day

my_data$age.orth.2 <- poly(my_data$day, 2, raw=FALSE)[,2] # orthogonal version of day^2

Then:

model <- lmer(affect ~ age.orth.1*group + age.orth.2*group + 
                       (1 + age.orth.1|ID),
                       data = my_data, REML = TRUE)

In principle, both the slope for age.orth.1 and age.orth.2 can vary across the persons represented by the ones included in your study. However, depending on your data, your model may or may not support the inclusion of two random slopes - one for age.orth.1 and the other for age.orth.2. If the data support only the inclusion of a random slope for age.orth.1, then you don't have much choice.

Related Question