Solved – How to estimate variance components with lmer for models with random effects and compare them with lme results

anovalme4-nlmervariance

I performed an experiment where I raised different families coming from two different source populations. Each family was assigned one of two treatments. After the experiment I measured several traits on each individual.
To test for an effect of either treatment or source as well as their interaction, I used a linear mixed effect model with family as random factor, i.e.

lme(fixed=Trait~Treatment*Source,random=~1|Family,method="ML")

so far so good,
Now I have to calculate the relative variance components, i.e. the percentage of variation that is explained by either treatment or source as well as the interaction.

Without a random effect, I could easily use the sums of squares (SS) to calculate the variance explained by each factor. But for a mixed model (with ML estimation), there are no SS, hence I thought I could use Treatment and Source as random effects too to estimate the variance, i.e.

lme(fixed=Trait~1,random=~(Treatment*Source)|Family, method="REML")

However, in some cases, lme does not converge, hence I used lmer from the lme4 package:

lmer(Trait~1+(Treatment*Source|Family),data=DATA)

Where I extract the variances from the model using the summary function:

model<-lmer(Trait~1+(Treatment*Source|Family),data=regrexpdat)
results<-VarCorr(model)
variances<-results[,3]

I get the same values as with the VarCorr function. I use then these values to calculate the actual percentage of variation taking the sum as the total variation.

Where I am struggling is with the interpretation of the results from the initial lme model (with treatment and source as fixed effects) and the random model to estimate the variance components (with treatment and source as random effect). I find in most cases that the percentage of variance explained by each factor does not correspond to the significance of the fixed effect.

For example for the trait HD,
The initial lme suggests a tendency for the interaction as well as a significance for Treatment. Using a backward procedure, I find that Treatment has a close to significant tendency. However, estimating variance components, I find that Source has the highest variance, making up to 26.7% of the total variance.

The lme:

anova(lme(fixed=HD~as.factor(Treatment)*as.factor(Source),random=~1|as.factor(Family),method="ML",data=test),type="m")
                                      numDF denDF  F-value p-value
(Intercept)                                1   426 0.044523  0.8330
as.factor(Treatment)                       1   426 5.935189  0.0153
as.factor(Source)                          1    11 0.042662  0.8401
as.factor(Treatment):as.factor(Source)     1   426 3.754112  0.0533

And the lmer:

summary(lmer(HD~1+(as.factor(Treatment)*as.factor(Source)|Family),data=regrexpdat))
Linear mixed model fit by REML 
Formula: HD ~ 1 + (as.factor(Treatment) * as.factor(Source) | Family) 
   Data: regrexpdat 
    AIC    BIC logLik deviance REMLdev
 -103.5 -54.43  63.75   -132.5  -127.5
Random effects:
 Groups   Name                                      Variance  Std.Dev. Corr                 
 Family   (Intercept)                               0.0113276 0.106431                      
          as.factor(Treatment)                      0.0063710 0.079819  0.405               
          as.factor(Source)                         0.0235294 0.153393 -0.134 -0.157        
          as.factor(Treatment)L:as.factor(Source)   0.0076353 0.087380 -0.578 -0.589 -0.585 
 Residual                                           0.0394610 0.198648                      
Number of obs: 441, groups: Family, 13

Fixed effects:
            Estimate Std. Error t value
(Intercept) -0.02740    0.03237  -0.846

Hence my question is, is it correct what I am doing? Or should I use another way to estimate the amount of variance explained by each factor (i.e. Treatment, Source and their interaction). For example, would the effect sizes be a more appropriate way to go?

Best Answer

One common way to determine the relative contribution of each factor to a model is to remove the factor and compare the relative likelihood with something like a chi-squared test:

pchisq(logLik(model1) - logLik(model2), 1)

As the way that likelihoods are calculated between functions may be slightly different, I typically will only compare them between the same method.