This seems like a relatively basic question, but I can't find good pointers after two days of searching.
I am trying to fit a generalized linear mixed model to data obtained in an experiment. The data are repeated measures on subjects over two conditions (one main effect), and I am interested in measuring that main effect.
The data are the following:
subject <- factor(rep(c(1,2,3,4,5,6,7), each=2))
n <- rep(85, 14)
Ncorrect <- c(67, 70, 65, 80, 70, 76, 67, 62, 65, 71, 63, 75, 58, 71)
Out <- rep(c(0,1), 7) # condition encoded as integer
OutF <- factor(Out) # condition encoded as factor
expData <- data.frame(subject, n, Ncorrect, Out, OutF)
I want to fit a mixed model to take into account the between-subject variation of the main effect (a random slope). The problem is that I get different fitted models depending on whether I encode the experimental condition as a factor or as a 0/1 integer.
fitInt <- glmer(cbind(Ncorrect, n-Ncorrect) ~ Out + (Out+0|subject),
data=expData, family=binomial) #Out is an integer
fitFac <- glmer(cbind(Ncorrect, n-Ncorrect) ~ OutF + (OutF+0|subject),
data=expData, family=binomial) #OutF is factor
First the similarity: they both return almost identical fixed effects (Out and Intercept), which is what I'm interested in.
Then the difference: fitFac shows df.resid = 9, while fitInt shows df.resid = 11 (that is, 5 vs. 3 estimated parameters on 14 observations), and they differ in AIC and BIC. In this example fitInt seems more in line with the 3 parameters I want: the (Intercept), the Out main effect, and the standard deviation of Out among subjects.
The random-effects summaries for the two fits are the following:
summary(fitInt)
Random effects:
Groups Name Variance Std.Dev.
subject Out 0.1466 0.3829
summary(fitFac)
Random effects:
Groups Name Variance Std.Dev. Corr
subject OutF0 9.417e-08 0.0003069
        OutF1 1.466e-01 0.3828849 -0.96
It seems to me that fitFac is fitting a separate random effect for each level of OutF, plus a correlation between those two. That would explain the random-effects summary and the difference in degrees of freedom.
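This suspicion can be checked without refitting anything, by building the fixed-formula analogue of the two random-effects terms with base R's model.matrix() (a quick sketch on the data above; glmer uses the same machinery to construct the columns of each random-effects term):

```r
# Rebuild the two predictors as in the question
subject <- factor(rep(1:7, each = 2))
Out  <- rep(c(0, 1), 7)   # condition as integer
OutF <- factor(Out)       # condition as factor
expData <- data.frame(subject, Out, OutF)

# Design columns implied by "Out + 0" vs. "OutF + 0"
Xint <- model.matrix(~ Out + 0,  data = expData)  # one numeric 0/1 column
Xfac <- model.matrix(~ OutF + 0, data = expData)  # one dummy column PER LEVEL

ncol(Xint)       # 1 column
ncol(Xfac)       # 2 columns
colnames(Xfac)   # "OutF0" "OutF1"
```

One column per subject means one variance parameter; two columns per subject mean two variances plus a correlation, matching the two summaries.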
As asked for in a comment, the Z matrices for both fits are the following:
getME(fitFac,"Z")
14 x 14 sparse Matrix of class "dgCMatrix"
[[ suppressing 14 column names ‘1’, ‘1’, ‘2’ ... ]]
1 1 . . . . . . . . . . . . .
2 . 1 . . . . . . . . . . . .
3 . . 1 . . . . . . . . . . .
4 . . . 1 . . . . . . . . . .
5 . . . . 1 . . . . . . . . .
6 . . . . . 1 . . . . . . . .
7 . . . . . . 1 . . . . . . .
8 . . . . . . . 1 . . . . . .
9 . . . . . . . . 1 . . . . .
10 . . . . . . . . . 1 . . . .
11 . . . . . . . . . . 1 . . .
12 . . . . . . . . . . . 1 . .
13 . . . . . . . . . . . . 1 .
14 . . . . . . . . . . . . . 1
getME(fitInt,"Z")
14 x 7 sparse Matrix of class "dgCMatrix"
1 2 3 4 5 6 7
1 . . . . . . .
2 1 . . . . . .
3 . . . . . . .
4 . 1 . . . . .
5 . . . . . . .
6 . . 1 . . . .
7 . . . . . . .
8 . . . 1 . . .
9 . . . . . . .
10 . . . . 1 . .
11 . . . . . . .
12 . . . . . 1 .
13 . . . . . . .
14 . . . . . . 1
Since I want just the random effect of subject on the main effect (Out), the second matrix seems more in line with that.
The problem seems to be that I want one random effect for Out, but when it is coded as a factor (even a binary one), an effect is fitted for each level, which leads to all the variance falling into one of the levels. Coding Out as an integer seems to give me the model I want, but according to responses/comments here and some previous experience of mine, numeric coding of conditions also seems messier. This looks like a basic issue; maybe I'm missing something obvious that I have somehow managed without so far.
The main effects I'm interested in don't change between the fits, but I have to apply this analysis to other data too, and I want to understand what's happening before publishing the results.
Thanks for any help.
Best Answer
What you observe is due to how the formula interface in R works when you give it a factor and exclude the intercept. That is, the formulas
~ Out + 0
and
~ OutF + 0
are not equivalent. To see this, compare the output of the model.matrix() function that is used internally to construct the corresponding design matrices, i.e., compare
model.matrix(~ Out + 0, data = expData)
with
model.matrix(~ OutF + 0, data = expData)
The first produces a single 0/1 column; the second produces one dummy column per level of OutF. That is why (OutF + 0 | subject) estimates two variances and a correlation, whereas (Out + 0 | subject) estimates a single variance. To make the two equivalent you need to replace the factor by a single numeric dummy, e.g.
~ as.numeric(as.character(OutF)) + 0
Hence, compare
model.matrix(~ Out + 0, data = expData)
with
model.matrix(~ as.numeric(as.character(OutF)) + 0, data = expData)
which now yield the same single column of 0/1 values.
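Putting this into practice for the models above (a sketch; the column name OutNum is mine, and the glmer refit assumes the lme4 package is available, so it is left as a comment): because the factor levels here are "0" and "1", converting the factor back to numeric recovers exactly the original integer coding, and the factor-based model then reduces to the intended one.

```r
# Rebuild a single numeric 0/1 dummy from the factor. With levels "0"/"1",
# as.character() returns "0"/"1" and as.numeric() turns those into 0/1.
Out  <- rep(c(0, 1), 7)
OutF <- factor(Out)
OutNum <- as.numeric(as.character(OutF))  # hypothetical helper column
identical(OutNum, Out)                    # TRUE: same coding as in fitInt

# With this column, the refit matches fitInt (one random-slope variance):
# library(lme4)
# fitNum <- glmer(cbind(Ncorrect, n - Ncorrect) ~ OutNum + (OutNum + 0 | subject),
#                 data = expData, family = binomial)
```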