Mixed Effects Model – Impact of Including Year as Categorical Random Effect on Long-Term Trends

mixed-model, r, test-for-trend, time-series

I am trying to detect evidence of warming in a monthly temperature time series over a 20-year period by testing for a trend. I have precisely followed the method of Crawley (2013) The R Book, 2nd Edition pgs 798-799. In his linear mixed effects model for monthly temperatures he treats the explanatory variables time and linear trend as fixed effects, and year as a categorical random effect allowing for different intercepts for the different years. He then uses ANOVA to compare the full model (with trend explanatory variable) with a reduced version (i.e. without the trend explanatory variable).

A reviewer has questioned why year has been treated as a random effect and suggested that by doing so this would essentially remove a long-term trend. Can anyone clarify why it is correct to include year as a random effect and if by doing so this does or does not remove a trend?

Best Answer

Including by-Year random intercepts won't remove a long-term linear trend, but it may capture other non-linear trends not captured by the fixed effects. See this great explanation from Thierry Onkelinx, which discusses the same problem when Year has a quadratic trend. (One important difference: what the time and trend fixed effects do in your problem is handled there by Year in the fixed effects.)
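
A small simulation can make this concrete. The sketch below is not Crawley's code: it assumes `lme` from the nlme package, simulated data, and made-up names (`d`, `index`, `month`, `yr`, `temp`), with a true linear trend of 0.01 degrees per month built in. It fits the model with and without the trend term, both with by-year random intercepts, and compares the two by a likelihood ratio test.

```r
library(nlme)  # ships with standard R installations

set.seed(42)
n_years <- 20
d <- data.frame(index = seq_len(12 * n_years))   # months since start (trend variable)
d$month <- (d$index - 1) %% 12 + 1               # month within year, 1..12
d$yr    <- factor((d$index - 1) %/% 12 + 1)      # year as a categorical grouping factor

# Simulated temperatures (assumed values): linear trend + seasonal cycle
# + a random shift per year + monthly noise.
year_shift <- rnorm(n_years, sd = 0.3)
d$temp <- 10 + 0.01 * d$index +
  5 * sin(2 * pi * d$month / 12) +
  year_shift[as.integer(d$yr)] +
  rnorm(nrow(d), sd = 1)

# Full model: trend and seasonal terms fixed, by-year random intercepts.
# method = "ML" so that models with different fixed effects can be compared.
m_full <- lme(temp ~ index + sin(2 * pi * month / 12) + cos(2 * pi * month / 12),
              random = ~ 1 | yr, data = d, method = "ML")

# Reduced model: same structure without the trend term.
m_red <- lme(temp ~ sin(2 * pi * month / 12) + cos(2 * pi * month / 12),
             random = ~ 1 | yr, data = d, method = "ML")

trend_est <- fixef(m_full)[["index"]]               # estimated trend per month
lrt_p     <- anova(m_red, m_full)[["p-value"]][2]   # likelihood ratio test p-value

print(trend_est)
print(lrt_p)
print(VarCorr(m_red))   # by-year variance when no trend term is present
```

In a setup like this, the trend is still recovered by the fixed effect even though year is a random intercept. When the trend term is dropped, the by-year variance would be expected to grow, because the year intercepts are then the only place the warming can go.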

There is the deeper question though about whether a model structure involving multiple terms for time (time, Year, trend) makes sense. I certainly can think of situations where this makes sense (e.g. student effort over the course of an academic year, where time represents the time within the year, trend models long-term trends over generations of students, and Year some other aspect of the sampling, perhaps the precise year within a generation of students), but you could easily stumble onto a collinear or nonsensical model structure when you have multiple model parameters for closely related measurement variables.

Related Solutions

Solved – How to formulate linear mixed model to find out effects of continuous variables

One way to improve the marginal $R^2$ calculation you used would be to compute it with the other predictors included in the model. As it stands, the marginal $R^2$ calculation only takes into account one predictor at a time.

An alternative is to fit two models. One model contains all predictors, the other has one predictor dropped. The models can then be compared to see the decrease in marginal $R^2$ that is due to removal of the predictor. For instance:

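library(lme4)   # lmer()
library(MuMIn)  # r.squaredGLMM()

m1 <- lmer(resp ~ pred1 + pred2 + pred3 + (1|weeks) + (1|Sample), data = Xs)
m2 <- lmer(resp ~ pred2 + pred3 + (1|weeks) + (1|Sample), data = Xs)

# Drop in marginal R^2 (first element, R2m) when pred1 is removed
r.squaredGLMM(m1)[[1]] - r.squaredGLMM(m2)[[1]]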

This tells you that the marginal $R^2$ drops quite a bit by simply removing the first predictor. This echoes your approach, but has the added benefit of including all relevant predictors in the model that is used to calculate the goodness of fit.
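
For reference, here is a sketch of what is being compared, assuming the usual Nakagawa and Schielzeth definition of marginal $R^2$ for a Gaussian mixed model with the two random intercepts used above. The symbols are introduced here for illustration: $\sigma_f^2$ is the variance of the fixed-effects predictions, $\sigma_{weeks}^2$ and $\sigma_{Sample}^2$ are the random-intercept variances, and $\sigma_\varepsilon^2$ is the residual variance.

```latex
$$
R^2_{m} = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_{weeks}^2 + \sigma_{Sample}^2 + \sigma_\varepsilon^2},
\qquad
\Delta R^2_{m} = R^2_{m}(\texttt{m1}) - R^2_{m}(\texttt{m2})
$$
```

Under this definition, $\Delta R^2_m$ is the share of total variance attributed to the fixed effects that is lost when the first predictor is dropped while the others stay in the model.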

With regard to building a suitable model, why have you removed the intercept? This is a key piece of information. When you do that you are forcing the model to pass through the origin. Specifically, you are enforcing that when the predictors take on values of 0 the predicted response must be 0. I suspect that this is probably not what you want.
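
The effect of dropping the intercept is easy to see in a toy example. The sketch below uses an ordinary `lm` on simulated data rather than your mixed model (an assumption made to keep it short), with a true intercept of 5 and a true slope of 2.

```r
set.seed(1)
x <- runif(100, min = 0, max = 10)
y <- 5 + 2 * x + rnorm(100, sd = 1)   # true intercept 5, true slope 2

fit_with    <- lm(y ~ x)       # intercept estimated from the data
fit_without <- lm(y ~ 0 + x)   # intercept removed: line forced through (0, 0)

slope_with    <- coef(fit_with)[["x"]]
slope_without <- coef(fit_without)[["x"]]

print(c(with_intercept = slope_with, without_intercept = slope_without))
```

Without the intercept, the slope has to compensate for the missing offset, so it is pulled away from the true value.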

Since you said that you are interested in the relative effects of the predictors, standardizing your predictors as you have done is a good idea.

An alternative is to fit a model with scaled predictors such as this:

m3 <- lmer(resp ~ pred1 + pred2 + pred3 + (1|weeks) + (1|Sample), data = X)

Since you standardized the predictors the estimated $\beta$s represent the relative effect of the predictors on the outcome $resp$.
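
To illustrate why standardizing matters for comparing effects, here is a minimal sketch with simulated data and an ordinary `lm` (both are assumptions, chosen for brevity; the names `x1`, `x2`, `z1`, `z2` are made up). The two predictors are on very different scales, so their raw coefficients are not comparable.

```r
set.seed(2)
n  <- 200
x1 <- rnorm(n, sd = 1)     # predictor on a small scale
x2 <- rnorm(n, sd = 100)   # predictor on a large scale
y  <- 1 * x1 + 0.02 * x2 + rnorm(n)

# Raw coefficients: change in y per one unit of each predictor.
raw <- coef(lm(y ~ x1 + x2))

# Standardized predictors: change in y per one standard deviation.
z1  <- as.numeric(scale(x1))
z2  <- as.numeric(scale(x2))
std <- coef(lm(y ~ z1 + z2))

print(raw)
print(std)
```

On the raw scale `x2` looks negligible next to `x1`, but per standard deviation it has roughly twice the effect, which is the comparison the standardized $\beta$s express.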

To test whether these relationships are likely to be true not only in the sample, but also in the population, a sensible approach is to conduct model comparisons such as likelihood ratio tests, AIC or BIC.

The way this is done is to remove predictors in a stepwise manner and compare the two models with your comparison method of choice. If such a comparison reveals that a predictor does not significantly contribute to an improvement in overall fit, then you can remove this predictor from your model and also consider reporting that there appears to be no relationship between that predictor and your outcome. There is plenty of information on this site about model comparison.
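
A comparison of this kind can be sketched as follows. This example assumes `lme` from nlme instead of `lmer`, a single grouping factor, and simulated data in which only `pred1` has a real effect; all names and values are hypothetical. Models are fitted by maximum likelihood because they differ in their fixed effects.

```r
library(nlme)

set.seed(3)
n_groups <- 20
n_per    <- 10
dat <- data.frame(
  Sample = factor(rep(seq_len(n_groups), each = n_per)),
  pred1  = rnorm(n_groups * n_per),
  pred2  = rnorm(n_groups * n_per)   # no true effect on resp
)
group_shift <- rnorm(n_groups, sd = 1)
dat$resp <- 2 + 1 * dat$pred1 +
  group_shift[as.integer(dat$Sample)] + rnorm(nrow(dat))

full     <- lme(resp ~ pred1 + pred2, random = ~ 1 | Sample, data = dat, method = "ML")
no_pred1 <- lme(resp ~ pred2,         random = ~ 1 | Sample, data = dat, method = "ML")
no_pred2 <- lme(resp ~ pred1,         random = ~ 1 | Sample, data = dat, method = "ML")

# Likelihood ratio tests (AIC and BIC are shown in the same tables).
cmp1 <- anova(no_pred1, full)   # does pred1 improve the fit?
cmp2 <- anova(no_pred2, full)   # does pred2 improve the fit?
print(cmp1)
print(cmp2)

p_drop_pred1 <- cmp1[["p-value"]][2]
```

Dropping `pred1` should clearly worsen the fit here, while the test for `pred2` indicates whether that predictor can be removed.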

Solved – Autocorrelation across random effects in nlme:lme

You're fitting both a random and a fixed effect for the year variable. This is unlikely to work well: chances are that either the fixed effect will be estimated near zero or the standard deviation of the random effect will be near zero. Instead, use one of these formulas (which I suspect is what you actually want):

Measuring continuous yearly trend:

library(nlme)
lme(Response ~ autor + Month, method = "REML", random = ~ 1 | Site, correlation = corAR1(value = 0.1, form = ~ autor | Site), data = data1)

Measuring a separate response for each year:

lme(Response ~ Year + Month, method="REML", random = ~ 1|Site, correlation=corAR1(value=0.1,form=~autor|Site), data=data1)

Since Year will no longer be a random effect, the problem should be solved.
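
As a rough check that this kind of specification behaves as intended, the sketch below simulates data with a linear trend, site-level shifts, and AR(1) errors within each site, then fits the first form of the model. The data, the true values (slope 0.05, AR coefficient 0.6), and the name `t` (standing in for `autor`) are all assumptions made for illustration; `Month` is left out to keep it short.

```r
library(nlme)

set.seed(4)
n_sites <- 10
n_t     <- 60

# One row per site and time point; t varies fastest within each Site.
dat <- expand.grid(t = seq_len(n_t), Site = factor(seq_len(n_sites)))

site_shift <- rnorm(n_sites, sd = 1)
# Independent AR(1) error series for each site, concatenated in site order.
ar_noise <- unlist(lapply(seq_len(n_sites), function(i) {
  as.numeric(arima.sim(model = list(ar = 0.6), n = n_t))
}))
dat$Response <- 5 + 0.05 * dat$t + site_shift[as.integer(dat$Site)] + ar_noise

# Continuous trend as a fixed effect, random intercept per site,
# AR(1) correlation along t within each site.
fit <- lme(Response ~ t, random = ~ 1 | Site,
           correlation = corAR1(value = 0.1, form = ~ t | Site),
           data = dat, method = "REML")

phi_hat   <- coef(fit$modelStruct$corStruct, unconstrained = FALSE)[["Phi"]]
slope_hat <- fixef(fit)[["t"]]
print(c(Phi = phi_hat, slope = slope_hat))
```

The `value` passed to `corAR1` is only a starting value; the fitted coefficient is read back from the model's correlation structure.
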

As far as I know, not in R: lme4 supports crossed random effects but not AR correlation structures, and nlme does not support crossed random effects. You could use SAS or SPSS. But first you could try fitting the model with lme4 and see whether there is any autocorrelation. You could also try plm (http://cran.r-project.org/web/packages/plm/index.html); I don't have any experience with this package myself. By the way, I don't see any crossed random effects in your data.
