Time Series Analysis – Fixed and Random Factor Analysis in R

anovartime series

I have a dataset which I am not sure how to analyse.

The dataset came from the following experiment: I grew plants (2 different types) and measured their height at different time point (each plant was followed individually). I had in total, 3 boxes in which I grew the plants and, in each box, 3 plants of each type. I took the measurement at 4 different time points.

So, if I am not wrong, the plant type is a fixed factor and the box a random one.

Here is how it is structured:

time, box, type, height
1, 1, 1, 1.2
1, 1, 1, 1.3
1, 1, 1, 1.1
1, 1, 2, 1.4 
1, 1, 2, 1.5
1, 1, 2, 1.6
...
2, 1, 1, 1.2
2, 1, 1, 1.3
...
1, 2, 1, 1.2
1, 2, 1, 1.3    

I would like to know what is the correct way to check if there is a difference between the different plant types using R

So far, what I have done is this:

lme1 <- lme(height ~ type, random= ~ 1|box, data=mydata)
anova(lme1)

but I do not know how to include the time variable in the analysis..

Here is the plot of the height evolution with time, for the different plant type. Each line is a plant.

enter image description here

Best Answer

For mixed-effects modells of time series I usually use the nlme package, because it offers facilities to model auto-correlation structures. If I don't need to consider auto-correlation, I prefer the lme4 package, which offers more flexibility for specifying random effects and is also usually faster.

I recommend reading Zuur et al. 2009 (ISBN 978-0-387-87457-9) as an introduction to mixed effects modelling with R. The book contains a lot of nice and illustrative examples using the nlme package.

From your question and comments the structure of your model should be as follows:

  • fixed effects: intercept, plant type (make sure that this is a factor variable and not a numeric in R!), time and the interaction between both
  • random effects: random intercept grouped by plant nested within box, possibly also a random slope vs. time with the same nesting structure
  • correlation structure: some kind of auto-regressive moving average correlation structure with time as a co-variate and the same grouping structure as the random effect

Thus, a full model could look like this: fit1 <- lme(height ~ type * time, random= ~ 1|box/plant, correlation=corARMA(0.2, form=~time|box/plant, p=1, q=0), data=mydata). You should test by comparing models using the anova function if including a random slope improves the model. You should also test, which auto-correlation structure is most appropriate (although Zuur et al. advise against spending too much effort on finding the optimal auto-correlation structure). Of course, you also need to study residual plots. Possibly you might need to specify a variance structure or transform the dependent if the model suffers from heteroskedasticity. Judging from the plot the relationship between height and time is pretty linear, but you could also try to transform time.

Potential problems: You have an extremely small number of time points, which could result in problems when trying to fit the auto-correlation parameter(s). It could even be impossible to fit them. Also, your number of individuals and boxes is very small.

Related Question