Random Effects Model – Does it Make Sense to Include a Factor as Both Fixed and Random in Linear Mixed Effects Model?

lme4-nlmerrandom-effects-model

I know this sounds contradictory, if we think of main effects, as the random factor would remove some of the variation that we want to explain by the fixed effect.

But let’s look at a study design where fields were paired (according to land use in the surroundings, distance etc.). In each field pair one was randomly assigned to be sown with an insecticide while the other one served as control. In each field several bee colonies were placed and a response variable (e.g. the weight) was measured.

# Example data: 

set.seed(123)

colony = as.factor(1:96)
colony_effect = rnorm(96, mean = 2)

field = as.factor(sort(rep(c(1:16),6)))
field_e = rnorm(16, mean = 2)
field_effect = rep(field_e, each = 6)

field_pair = as.factor(sort(rep(c(1:8),12)))
field_pair_e = rnorm(8, mean = 2)
field_pair_effect = rep(field_pair_e, each = 12)

treatment = as.factor(rep(c(rep("control", 6), rep("treat", 6)), 8))
treatment_effect = rep(c(rep(0, 6), rep(1, 6)), 8)     

response = treatment_effect + field_effect + field_pair_effect + colony_effect
df1 = data.frame(treatment, field_pair, field, colony, response)

Intuitively, I would specify the model like this, if I want to test for a treatment effect on the response:

library(lme4)
lmm2 = lmer(response ~ treatment + (1|field_pair/field), data = df1)
summary(lmm2)
anova(lmm2, update(lmm2,.~1 + (1|field_pair/field)), test = "LRT") # p = 0.008118

I think my colleagues used the same model only that they specified it slightly differently:

lmm2_2 = lmer(response ~ treatment + (1|field_pair) + (1|field_pair:treatment), data = df1)
summary(lmm2_2)
anova(lmm2_2, update(lmm2_2,.~1 + (1|field_pair) + (1|field_pair:treatment)), test = "LRT") #  p = 0.008118

However, an answer in this thread on a question on the choice of random factors, made me wonder whether or not field pair should be included in the model. The asker wanted to explain the fish growth of two fish populations that were placed in several tanks and exposed to two temperature regimes (crossed fixed effects: population type and temperature).

@Aron answered:

”It doesn't make sense to both include tank as a random effect and
nest tank within the pop/temp fixed effect. You only need one of
these, depending on how tank is coded.”

And:

“Including the tank random effect is only desired if the tanks were
first divided into two groups and then randomized to treatment; if the
eight tanks were completely randomized to treatment, this is not
necessary.”

This made me wonder whether the random structure that my colleagues used (and that I intend to use) is correct, since

Treatment is both fixed and random effect
Field pair was included rather than just field (which I believe would
be analogous to tank in the fish example, if the tanks were grouped
beforehand)

For comaparison, model without field pair:
```
lmm1 = lmer(response ~ treatment + (1|field), data = df1)
summary(lmm1)
anova(lmm1, update(lmm1,.~1 + (1|field)), test = "LRT") # p = 0.0126
```

Best Answer

To answer to the title of your question: yes, it can make sense to include a factorial variable as fixed and a random effect. Depending on the data structure/ experimental design this may even be necessary to do so to arrive at valid conclusions. For a very readable discussion of the topic see: Barr DJ, Levy R, Scheepers C, Tily HJ. Random effects structure for confirmatory hypothesis testing: Keep it maximal. J Mem Lang 2013;68(3):255–78.

There is a similar example with an answer by @Ben Bolker, who also recommends to use one variable (organ) as fixed and mixed: Mixed Effects Model with Nesting.

I could not quite wrap my head around the answer to your particular experiment, yet. The question is, how many random effects are "enough" to account for all dependencies introduced by your experimental design. Your lmm2 looks fine to me. Intuitively, I would have specified the model with a random slope for treatment within field_pairs and ignore fields alltogether, like:

lmm3 = lmer(response ~ treatment + (1+treatment|field_pair), data = df1)
summary(lmm3)
anova(lmm3, update(lmm3,.~1 + (1+treatment|field_pair)), test = "LRT") # p=.00862

This gives very similar results to your lmm2, but yields a slightly lower AIC in your sample data. I would be glad to hear other opinions on this and whether these approaches are equivalent.

Related Solutions

Solved – Random effect nested under fixed effect model in R

It doesn't make sense to both include tank as a random effect and nest tank within the pop/temp fixed effect. You only need one of these, depending on how tank is coded.

If tank is coded 1-8, you only need the tank random effect. Nesting it within the pop/temp fixed effect results in the same 8 units, so is not necessary.

If tank is coded 1-2 (that is, which rep it was), you only need to nest tank within the pop/temp fixed effect, because that gives you your 8 unique tanks. Including the tank random effect is only desired if the tanks were first divided into two groups and then randomized to treatment; if the eight tanks were completely randomized to treatment, this is not necessary.

You could do this with likelihood based solutions such those in nlme and lme4 but if everything is balanced, it might be simpler to use the traditional ANOVA approach using aov.

Creating some sample data:

set.seed(5)
d <- within(expand.grid(pop=factor(c("A","B")),
                        temp=factor(c("warm", "cold")),
                        rep=1:2,
                        fish=1:100), {
                          tank <- factor(paste(pop, temp, rep, sep="."))
                          tanke <- round(rnorm(nlevels(tank))[unclass(tank)],1)
                          e <- round(rnorm(length(pop)),1)
                          m <- 10 + 2*as.numeric(pop)*as.numeric(temp)
                          growth <- m + tanke + e
                        })

Using aov like this:

a0 <- aov(growth ~ pop*temp + Error(tank), data=d)
summary(a0)

or lme like this:

library(nlme)
m1 <- lme(growth ~ pop*temp, random=~1|tank, data=d)
anova(m1)

Solved – Random factor nested in two fixed factors

aov and lme are not the same functions. aov is a wrapper for lm, the general linear model function. For balanced designs and one or two random levels, aov can cope with a random term by fitting linear models to different error strata (according to the help files). Significance testing is done using appropriate F statistics.

lme, on the other hand, fits a genuine mixed effects model using maximum likelihood (or REML) to obtain estimates of the parameters of the model. They are not required to give the same estimates.

It is also possible that your formulae are miss-specified, but I can't tell from the information you have supplied in the question. I don't see where there are two levels in the model. You have a 2x2 factorial with two replicates (the tanks) at each treatment level. The "tank effect" is simply the residual error -- unless you are taking several measurements from each tank.

Later

I simulated some data based on the OP's problem: a 2x2 factorial with 2 replicates at each level (the tank) and 9 measurements from each tank. There are 3 ways to analyse these data

Use aov and the formula given by the OP
Use lme (resp lmer) and the formula given by the OP
Take the mean survivorship within each tank and use oav() on the means.

All three methods lead to the same conclusions. The first and the third produce numerically identical F statistics (as they should, given the problem is balanced).

aov and lme give different output. To see the coefficients for pH and temp you need to do coef() on the output of the aov() function. I got identical estimates. The p-values are slightly different, as one would expect given a different model, different estimation and different test statistic.

Notice that you get 7 degrees of freedom in the aov() model. The treatments are randomized to the 8 tanks, so basically, the sample size is 8. Taking repeated measurements in each tank gives a more precise measurement of whatever is going on inside each tank -- which is good -- but in a sense, this is still an experiment in which n=8.

Best Answer

Related Solutions

Solved – Random effect nested under fixed effect model in R

Solved – Random factor nested in two fixed factors

Related Question