I'm trying to use this method for calculating the Information Coefficient using bootstrapping. The advantage of using bootstrapping is that I can compare models that are not nested. But to do this, I need to be able to calculate the likelihood of out-of-sample data (because I'm bootstrapping).
I have tried several different methods, which give me wildly different results. This is easiest to illustrate when calculating the log-likelihood for the in-sample data. The easiest option is to use logLik:
library(lme4)
data(Orthodont, package = "MEMSS")
mod <- lmer(distance ~ age + (1 + age | Subject), data = Orthodont)
logLik(mod)
> -221.3183
But I get a different result using the residuals:
resid<-residuals(mod)
sum(dnorm(resid,sd=sd(resid),log=TRUE))
> -162.1903
I also tried using the residual variance given by lmer:
sum(dnorm(resid,sd=sigma(mod),log=TRUE))
> -165.5434
I know that log-likelihood is sometimes calculated by integrating over values for the parameters, whereas by using residuals, I am conditioning on the point estimates for the parameters. However, according to the help for logLik.merMod, logLik returns "log-likelihood at the fitted value of the parameters." I think that means they are conditioning on the point estimates.
Just to be sure, I tried estimating the unconditioned log-likelihood. By using predict with re.form=NA, you can retrieve the fitted values based on fixed effects only (ignoring random effects).
resid<-Orthodont$distance-predict(mod,newdata=Orthodont,re.form=NA)
sum(dnorm(resid,sd=sd(resid),log=TRUE))
> -252.7908
Interestingly, all of the above methods give roughly the same answer when using glm. So this seems to be specific to mixed effects models.
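A minimal sketch of that comparison for a model without random effects. The built-in cars data set and the variable names are chosen only for illustration and are not part of the analysis above. For a Gaussian glm, the residual-based sum matches logLik exactly when the residual standard deviation uses the maximum-likelihood divisor n, and differs only slightly with the n - 1 or n - p divisors.

```r
# Gaussian GLM on the built-in cars data (illustrative choice only)
fit <- glm(dist ~ speed, data = cars)
res <- residuals(fit, type = "response")  # observed minus fitted values

ll_builtin <- as.numeric(logLik(fit))

# Maximum-likelihood residual SD divides the sum of squares by n
ll_ml_sd <- sum(dnorm(res, sd = sqrt(mean(res^2)), log = TRUE))

# sd() divides by n - 1, so the result is close but not identical
ll_sd <- sum(dnorm(res, sd = sd(res), log = TRUE))

# sigma() divides by the residual degrees of freedom (n - 2 here)
ll_sigma <- sum(dnorm(res, sd = sigma(fit), log = TRUE))

c(logLik = ll_builtin, ml_sd = ll_ml_sd, sd = ll_sd, sigma = ll_sigma)
```

With no random effects the observations are independent given the fixed effects, so summing univariate normal log-densities of the residuals is the full log-likelihood. That is the property that stops holding in a mixed model.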
Best Answer
It seems that calculating log-likelihood for mixed effects models requires dealing with the covariance of error terms for the random effects. Here is a method for calculating log-likelihood by hand for both ML and REML:
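For reference, the textbook marginal log-likelihoods for a linear mixed model are written out below. This is the standard form rather than a reproduction of the code used in this answer. The notation is an assumption: y = Xβ + Zb + ε with b ~ N(0, G) and ε ~ N(0, σ²I), n observations, p fixed effects, and W in the REML term read as V⁻¹. The additive constant in the REML version follows one common convention and can differ between software packages.

```latex
\begin{aligned}
V &= Z G Z^{\top} + \sigma^{2} I_{n}, \\
\ell_{\mathrm{ML}} &= -\tfrac{n}{2}\log(2\pi)
  - \tfrac{1}{2}\log\det V
  - \tfrac{1}{2}\,(y - X\hat\beta)^{\top} V^{-1} (y - X\hat\beta), \\
\ell_{\mathrm{REML}} &= -\tfrac{n-p}{2}\log(2\pi)
  - \tfrac{1}{2}\log\det V
  - \tfrac{1}{2}\log\det\!\left(X^{\top} V^{-1} X\right)
  - \tfrac{1}{2}\,(y - X\hat\beta)^{\top} V^{-1} (y - X\hat\beta).
\end{aligned}
```

The off-diagonal entries of V carry the correlation between observations from the same subject, which is why summing independent normal densities of the residuals does not reproduce logLik.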
To calculate the likelihood for a new datapoint (or, more accurately, calculate the density for that new datapoint) using ML, given X and Y for the new subjects, compute V based on G and then calculate:
It's not clear this can be done using REML, since the - 1/2 * log(det(t(X) %*% W %*% X)) term cannot be decomposed into the contribution of each individual subject.
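A minimal sketch of the by-hand calculation for the Orthodont model from the question, assuming the standard marginal-likelihood formulas and W = V⁻¹. The helper name loglik_by_hand is hypothetical and not part of lme4; it builds G from the relative covariance factor returned by getME(mod, "Lambda"). This is one way to set up the calculation, not necessarily the one originally used here.

```r
library(lme4)
data(Orthodont, package = "MEMSS")

# Hypothetical helper (not part of lme4): marginal log-likelihood of a fitted
# lmer model, evaluated at the model's own parameter estimates.
loglik_by_hand <- function(mod) {
  y      <- getME(mod, "y")
  X      <- getME(mod, "X")            # fixed-effects design matrix (n x p)
  Z      <- getME(mod, "Z")            # random-effects design matrix (sparse)
  beta   <- fixef(mod)
  s2     <- sigma(mod)^2               # residual variance
  Lambda <- getME(mod, "Lambda")       # relative covariance factor
  G      <- s2 * tcrossprod(Lambda)    # covariance of the random effects
  V      <- as.matrix(Z %*% G %*% t(Z)) + s2 * diag(length(y))

  r <- y - as.vector(X %*% beta)       # residuals from fixed effects only
  n <- length(y)
  p <- ncol(X)
  logdet_V <- as.numeric(determinant(V, logarithm = TRUE)$modulus)
  quad     <- sum(r * solve(V, r))     # r' V^{-1} r

  if (isREML(mod)) {
    XtWX <- crossprod(X, solve(V, X))  # X' V^{-1} X
    logdet_XtWX <- as.numeric(determinant(XtWX, logarithm = TRUE)$modulus)
    -0.5 * ((n - p) * log(2 * pi) + logdet_V + logdet_XtWX + quad)
  } else {
    -0.5 * (n * log(2 * pi) + logdet_V + quad)
  }
}

mod_ml   <- lmer(distance ~ age + (1 + age | Subject), data = Orthodont,
                 REML = FALSE)
mod_reml <- lmer(distance ~ age + (1 + age | Subject), data = Orthodont)

# Each pair should agree up to numerical tolerance
c(by_hand = loglik_by_hand(mod_ml),   logLik = as.numeric(logLik(mod_ml)))
c(by_hand = loglik_by_hand(mod_reml), logLik = as.numeric(logLik(mod_reml)))
```

With a single grouping factor, V is block-diagonal by subject, so the ML value is a sum of per-subject multivariate normal log-densities. A new subject's contribution can therefore be evaluated from that subject's own X, Z and y using the fitted β, G and σ². The log(det(t(X) %*% W %*% X)) term in the REML branch involves all subjects at once, which is the obstacle described above.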
Many thanks to several experts who answered questions via email.