First, let's try to simulate some data with a similar structure of your problem's data. My understanding from your description is that we have something like this:
Say we have 4 schools with 8 subjects in each school, thus a total of 32 subjects.
Each subject is posed with 3 tasks (one from each group), thus a total of 96 observations. As we have 6 different tasks, we will ask 16 times each task (6*16=96), so 32 tasks from each group (32*3=96).
n.schools <- 2
n.subject <- 20
n <- n.subject*3
subject <- rep(1:n.subject, 3)
school <- rep(1:n.schools, (n/n.schools))
age <- rnorm(n.subject, 30, 5)
age <- rep(age,3)
gender <- sample(c(0,1),n.subject, replace=TRUE)
gender <- rep(gender,3)
A <- c(rep(1,n/3),rep(0,2*n/3))
B <- c(rep(0,n/3),rep(1,n/3), rep(0, n/3))
C <- c(rep(0,2*n/3),rep(1,n/3))
group <- c(rep("A",n/3), rep("B",n/3),rep("C",n/3))
task <- c(rep("a",n/6),rep("b",n/6),rep("c",n/6),rep("d",n/6),rep("e",n/6),rep("f",n/6))
u.subject <- rnorm((n/3), 0, 1)
u.school <- rnorm(n.schools, 0 ,1)
lattent <- -20 + 0.5*age + 2*gender + 2*A -3*B -3.2*C + u.subject + u.school + rnorm(n,0,1)
pr <- 1/(1+exp(-lattent))
success <- rbinom(n, 1, pr)
Now let's try a GLMM model as the one you propose:
library(lme4)
> glmer(success ~ age + gender + group/task + (1 + group/task | school/subject), family=binomial)
Error: number of observations (=60) < number of random effects (=360)
for term (1 + group/task | subject:school);
the random-effects parameters are probably unidentifiable
So what does this mean? In general, you need your mixed effects model to be identifiable, and one of the conditions for this to happen is $\sum_{i=1}^N (n_i-k) >0$, where $k$ is the number of random effects. In the case of balanced data, this can be written equivalently $N n_1 > N k$ or $n_1 > k$. In other words you need to have less random parameters than the number of observations in each cluster/group, subject in your case. This cannot be the case if you add the group/task
term in the random part.
What group/task
or school/subject
means in the formula?
This is simply an equivalent way of writing group + group:task
so you actually have group
and its interaction with task
. So the last question doesn't actually makes sense. The same principle is used when you put this term as a grouping factor in the right part of (1 | ...)
which actually translates to (1| school) + (1 | school:subject)
hence the one factor nested within the other.
So, I suggest you read Chapter 2 from Bates book and pay extra attention to model specification. It is really important to define what actually makes sense to include as fixed or random effects and what the grouping factors will be. This depends on how you designed the study and what you want to extract from your models.
Best Answer
I think it may be a little more complex than just "fixed" or "random" effect. What you seem to be suggesting is that there is a known decline in bird abundance over the years. What is perhaps not known is whether that can be explained by values of existing variables in your regression. Ideally, you would include all the variables that could be influencing abundance and not the year, but it seems perhaps you have some unmeasured variables that are time-dependent.
If you include a coefficient for every possible year (treating it as a factor) this will lead to a fairly saturated model, and you may get biased coefficient estimates for other variables.
If you instead treat year as a random effect (i.e., for each year the effect is sampled randomly from a fixed Normal distribution) you are ignoring the requirement for random effects to be exchangeable, so that does not appear to be legitimate either.
If you instead include year as a linear predictor (i.e., have a single coefficient for the year, perhaps centred around the study midpoint year) you might run into problems if the actual effect of the unmeasured variables is non-linear. This could be checked by examining the prediction residuals versus the year covariate.
My advice would be to do the following:
Plot abundance (log transformed) versus year, to see what the overall structure looks like. If it seems to be linear then try adding year as a linear predictor (fixed effect) and examine the relationship between the residuals and year.
Run your model without year as a predictor and examine the relationship between the residuals from this model and year - if there is some form of structure then you need to account for it somehow.
Perhaps consider the use of fractional polynomials in your regression, as these can be quite flexible without increasing model complexity too dramatically. In this case you will need to rescale year so it is always positive but not too large.
Hope that is of some help...