Solved – Mixed-effect model / ANCOVA with lmer in R

Tags: ancova, lme4-nlme, mixed model, predictor, r

I have a question about the following example:

I want to find out whether two fertilizers (A and B) have different effects on the biomass of my plants. My explanatory variable is 'fertilizer' (categorical) with the levels 'A' and 'B'. My response variable is 'biomass' (continuous). In addition, I have two other continuous variables: 'seed mass' of the plants (measured before they germinated) and 'growth duration' (= harvesting date minus germination date). These two variables are expected to explain some of the variance in my data, so I want to include them as covariates. (For each of the two levels I have 30 plants, without any missing values.)

I read that my factor is a fixed effect and the covariates are random effects, as they represent a random sample out of the natural population. So I would do an ANCOVA in R using the lmer function (package lmerTest) like this:

model <- lmer(biomass~fertilizer+(1|seed_mass)+(1|growth_duration), data=dataset)
anova(model)

My question: are my reasoning and the way I'm performing the analysis correct, or is this analysis not appropriate for my question? I am particularly unsure about the covariates and whether I should include them in the model in a different way.

Best Answer

No, your model is certainly wrong. A continuous variable can't be the grouping factor of a random effect; the grouping factor has to be categorical, with several observations per level. Your covariates should enter the model as ordinary fixed-effect terms. You need a random effect if the assumption of independence of residuals is violated. Typically that's the case if observations are grouped, e.g., if you have measured the same experimental unit repeatedly ("repeated measures"). In your example that might be the case if you had several field plots or pots, each contributing several biomass measurements. If so, a valid model might be lmer(biomass ~ fertilizer + seed_mass + growth_duration + (1 | plotID), data=dataset). Here the random intercept accounts for plants from the same plot being more similar to each other than plants from different plots, where these similarities are not explained by the fixed effects but by unknown ("random") experimental effects.
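To make the structure of that model concrete, here is a minimal sketch on simulated data. Everything about the data is an assumption made up for illustration: the 10 plots, the 6 plants per plot, the effect sizes, and the column name plotID are not from your experiment. It only shows how the covariates enter as fixed effects while the plot enters as a random intercept.

```r
library(lme4)  # assumes the lme4 package is installed

set.seed(1)
n_plots <- 10          # hypothetical number of plots
plants_per_plot <- 6   # hypothetical number of plants per plot
n <- n_plots * plants_per_plot

# Simulated data set; fertilizer alternates within each plot (an assumption)
dataset <- data.frame(
  plotID          = factor(rep(seq_len(n_plots), each = plants_per_plot)),
  fertilizer      = factor(rep(c("A", "B"), times = n / 2)),
  seed_mass       = rnorm(n, mean = 5, sd = 1),
  growth_duration = rnorm(n, mean = 60, sd = 5)
)

# One shared random shift per plot plus individual residual noise
plot_effect <- rnorm(n_plots, sd = 2)
dataset$biomass <- 10 +
  3 * (dataset$fertilizer == "B") +
  1.5 * dataset$seed_mass +
  0.2 * dataset$growth_duration +
  plot_effect[as.integer(dataset$plotID)] +
  rnorm(n, sd = 1)

# Covariates are fixed effects; plotID is the grouping factor
model <- lmer(biomass ~ fertilizer + seed_mass + growth_duration + (1 | plotID),
              data = dataset)

fixef(model)    # estimated fixed-effect coefficients
VarCorr(model)  # between-plot and residual standard deviations
```

If you load lmerTest instead of lme4, the same lmer call also gives p-values in summary(model) and anova(model).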

I would recommend reading Zuur et al. if you are interested in learning about mixed effects models. The book is at the beginner level and contains some nice practical examples from ecology.

If you don't have repeated measures, you don't need a mixed effects model and can simply use lm, such as lm(biomass ~ fertilizer + seed_mass + growth_duration, data=dataset). You would probably also have to consider interactions (e.g., between fertilizer and a covariate) and of course need to do the usual regression diagnostics regarding homoscedasticity, multicollinearity, influential values, etc.
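As a rough sketch of that workflow, using only base R: the data below are simulated with made-up effect sizes (an assumption, only the 30 plants per fertilizer level mirror your description), and the choice of which interaction to test is just one example.

```r
set.seed(42)
n_per_level <- 30  # 30 plants per fertilizer level, as in the question
n <- 2 * n_per_level

# Simulated data set with hypothetical effect sizes
dataset <- data.frame(
  fertilizer      = factor(rep(c("A", "B"), each = n_per_level)),
  seed_mass       = rnorm(n, mean = 5, sd = 1),
  growth_duration = rnorm(n, mean = 60, sd = 5)
)
dataset$biomass <- 10 +
  3 * (dataset$fertilizer == "B") +
  1.5 * dataset$seed_mass +
  0.2 * dataset$growth_duration +
  rnorm(n, sd = 1)

# ANCOVA as an ordinary linear model
fit <- lm(biomass ~ fertilizer + seed_mass + growth_duration, data = dataset)
summary(fit)

# Marginal F tests for each term (anova(fit) would give sequential tests,
# which depend on the order of terms in the formula)
drop1(fit, test = "F")

# Does the seed_mass slope differ between fertilizers?
fit_int <- lm(biomass ~ fertilizer * seed_mass + growth_duration, data = dataset)
anova(fit, fit_int)

# Basic diagnostics: residual plots, a crude collinearity check,
# and Cook's distances for influential observations
par(mfrow = c(2, 2))
plot(fit)
cor(dataset$seed_mass, dataset$growth_duration)
head(sort(cooks.distance(fit), decreasing = TRUE))
```

If the interaction test gives no indication of different slopes, the simpler additive model is usually the one to report.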
