Solved – Interpreting lsmeans values from a mixed model with offset

lme4-nlmelsmeansoffsetr

I am using a generalized Linear Mixed-Effects model to look at the effects of different treatments on a density of trichomes.

The model is :

fitPoisson = glmer(Count_trichomes ~ Treatment1*Treatment2*Treatment3 + 
                         (1 | Block/Code) + offset(log(Length)), family=poisson(), data=dataset)

Treatment 1 and 2 has 2 levels (0 and 1) and Treatment 3 has 3 levels (0,1,2). Block accounts for the replicates and Code, for each individual. Length is in cm.

An anova(fitPoisson) told me that treatments 1 and 3 are significant and that there is no interactions. What I want now is to know what the density is for each level of treatments.

So I used a lsmeans to look at the differences :

    > lsmeans(fitPoisson, ~ Treatment1)

     Treatment1   lsmean         SE df asymp.LCL asymp.UCL
     0           5.309106 0.06113705 NA  5.189280  5.428933
     1           5.471452 0.06114033 NA  5.351619  5.591285

     Results are averaged over the levels of: Treatment2, Treatment3
     Results are given on the log (not the response) scale. 
     Confidence level used: 0.95

I can see that the density of level 0 is lower than the density of level 1, but I dont understand what are the units used. It doesn't seems like it is for trichomes/cm, since the mean for level 0 is 107 trichomes/cm and the mean for level 1 is 131 trichomes/cm (calculated in excel).

When I transform back from the log scale, it gives me :

    > summary(lsmeans(fitPoisson, ~ Treatment1), type = "response")

     Treatment1   rate       SE df asymp.LCL asymp.UCL
     0           202.1694 12.36004 NA  179.3393  227.9058
     1           237.8053 14.53949 NA  210.9496  268.0799

    Results are averaged over the levels of: Treatment2, Treatment3 
    Confidence level used: 0.95 
    Intervals are back-transformed from the log scale 

Which is still far from the means I found in excel.

Maybe I just don't understand the information lsmeans is giving me, or I am not using the right function.

Best Answer

If you do

lsmeans(fitPoisson, ~ Treatment1 * Treatment2 * Treatment3)

you will see the predictions (on the log scale) made by the model for each combination of the three factors. The results you show in your question are the averages of these predictions, averaged (with equal weights) over the levels of Treatment2 and Treatment3. The results with type = "response" are the antilogs of these results, and the standard errors are obtained using the delta method. Note that the averaging is still done on the log scale, and the confidence intervals are computed on the log scale, then back-transformed.

If you want the averaging to be done on the raw count scale, that is possible too. Do:

rg = regrid(ref.grid(fitPoisson), transform = TRUE)
lsmeans(rg, ~ Treatment1)

(The regrid function creates a new reference grid for the model based on the back-transformed predictions. You can use summary(rg) to see the individual predictions.)

These results can differ markedly from the raw averages you obtained from the data when there is imbalance in the data, so that the raw averages give far from equal weights to the levels of those two factors.