Solved – where to specify covariates in a linear mixed effect model

lme4-nlme, mixed-model, r

I am using lme4 to fit a mixed model to my data. I am looking at the effect of land cover on soil properties (for example, carbon concentration) at three depths.

I have identified land cover and soil depth as my fixed factors, with site as a random effect. I also have tree biomass, which may affect the response and which I had thought of as a "random" effect. However, I am aware that random effects must be categorical, so I should specify it as a covariate instead. I am unsure where to put it in the R code, though.

I've seen examples like this:

model <- lmer(carbon ~ land_cover + depth + biomass + (1 | site), 
              data = carbon, REML = FALSE)

and others which suggest it should be incorporated into the random effect as (biomass | site), possibly:

model <- lmer(carbon ~ land_cover + depth + (biomass | site), 
              data = carbon, REML = FALSE)

The first version seems to me to specify biomass as a fixed factor, which it is not. The second, however, produces an error for me. I think this might be due to missing data in my file, so if I sort that out, would the second version be correct?

Best Answer

Random effects (cases where you want to allow for random variation among groups) are not exactly the same as nuisance variables (variables that are not of primary interest but need to be included in the model for statistical reasons). Your biomass variable is a nuisance variable, but it's a fixed rather than a random effect; your first model is correct.
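
As a minimal sketch of that first model, the following fits biomass as a fixed-effect covariate alongside land cover and depth, with a random intercept for site. The data are simulated purely for illustration: the data frame `soil`, the factor levels, and the effect sizes are all assumptions, not your data. The fit is skipped if lme4 is not installed.

```r
# Simulated stand-in for the soil data; all values are made up for illustration.
set.seed(1)
n_site <- 12
soil <- expand.grid(
  site  = factor(seq_len(n_site)),
  depth = factor(c("0-10", "10-20", "20-30")),
  plot  = 1:3
)
# Assumption: each site has a single land cover type.
cover_by_site <- rep(c("forest", "pasture", "crop"), length.out = n_site)
soil$land_cover <- factor(cover_by_site[as.integer(soil$site)])
# Assumption: biomass is measured per observation, so it varies within sites.
soil$biomass <- runif(nrow(soil), 5, 50)
site_effect <- rnorm(n_site, sd = 0.5)
soil$carbon <- 2 + 0.03 * soil$biomass -
  0.4 * (as.integer(soil$depth) - 1) +
  site_effect[as.integer(soil$site)] +
  rnorm(nrow(soil), sd = 0.3)

if (requireNamespace("lme4", quietly = TRUE)) {
  # biomass enters as a fixed-effect covariate; site gets a random intercept.
  m1 <- lme4::lmer(carbon ~ land_cover + depth + biomass + (1 | site),
                   data = soil, REML = FALSE)
  print(lme4::fixef(m1))  # includes a single population-level slope for biomass
}
```

The fixed-effect table then contains one coefficient for biomass, which adjusts for it in the same way a covariate would in an ordinary linear model.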

Your second model (with (biomass|site)) would allow for random variation in the effect of biomass across sites. It generally wouldn't make sense without also having biomass as a fixed effect (carbon ~ ... + biomass + (biomass|site)), as it's realistic to expect that there will be some non-zero effect of biomass at the population level. (In order for this to work you'd also need to have multiple observations with different biomasses at each site, i.e. biomass would have to vary within at least some of the sites.)
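
The within-site requirement can be checked before fitting. The sketch below uses a small hypothetical data frame (`soil2` and its values are assumptions for illustration) and only base R for the check; the random-slope formula is written out as it would be passed to lmer().

```r
# Hypothetical data: biomass varies within sites A and B but not within site C.
soil2 <- data.frame(
  site    = factor(c("A", "A", "A", "B", "B", "B", "C", "C", "C")),
  biomass = c(10, 20, 30, 15, 15, 40, 25, 25, 25)
)

# Number of distinct non-missing biomass values per site.
n_distinct <- tapply(soil2$biomass, soil2$site,
                     function(x) length(unique(x[!is.na(x)])))
varies_within <- n_distinct > 1
print(varies_within)

# Random slope for biomass by site, with biomass also kept as a fixed effect.
slope_formula <- carbon ~ land_cover + depth + biomass + (biomass | site)
```

If biomass turns out to be constant within every site (for example, one biomass value recorded per site), a random slope for it cannot be estimated and the random-intercept model is the one to use.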


A short excerpt from a book chapter (from Fox, Negrete-Yankelevich, and Sosa (eds.), Ecological Statistics: Contemporary Theory and Application, Oxford University Press, 2015) where I discuss the idea that random effects may or may not be nuisance variables, and nuisance variables may or may not be random effects:

[Image: excerpt from the book chapter]
