Solved – Comparing lsmeans of levels from different factors

lme4-nlmelsmeansr

I have a glmer model that I am performing contrasts on. My model has four factors (emotion, gender, filter, ageGroup). No interactions. I have been using lsmeans(x, spec, contr='pairwise')) (and) lsm within glht) to get lsmeans and to do pairwise contrasts within-factors. I am also interested in comparing the means from one factor against a level from a different factor.

> summary(m7.only_within)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: binomial  ( logit )

 Formula: FaceStimulus.RESP.binary ~ emotion + emotion.Sample + gender +  
    gender.Sample + ageGroup + ageGroup.Sample + filter + filter.Sample + (1 | Subject)

Fixed effects:
                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -0.823451   0.264442  -3.114  0.00185 ** 
emotiondisgust               1.504896   0.092532  16.264  < 2e-16 ***
emotionfear                  1.555492   0.091832  16.938  < 2e-16 ***
emotion.Sample              -0.106948   0.044364  -2.411  0.01592 *  
gendermale                   0.012718   0.074198   0.171  0.86391    
gender.Sample                0.007021   0.010046   0.699  0.48462    
ageGroupold                  0.364239   0.076379   4.769 1.85e-06 ***
ageGroup.Sample             -0.013519   0.009577  -1.412  0.15808    
filtercropAdaptiveThreshold  0.419095   0.075906   5.521 3.37e-08 ***
filter.Sample                0.070147   0.029658   2.365  0.01802 *  
---

I understand that lsmeans calculates standard errors using variable variances and covariances from vcov() to do pairwise comparisons. Can I just compare one model variable against another? Like so:

summary(glht(m3.only_within, linfct = c("filtercropAdaptiveThreshold - ageGroupold = 0")))

Linear Hypotheses:
                                               Estimate Std. Error z value Pr(>|z|)    
filtercropAdaptiveThreshold - ageGroupold == 0  0.05486    0.10798   0.508        1    

That contrast is the test of the difference between the betas in the model, not between the lsmeans. Naturally, all my other contrasts compare the lsmeans since I'm using lsm(). Can I just manually take the difference between the lsmeans, and then use the variances and covariance SE's from the model vcov for the two model effects to get the SE for the lsmeans difference? Naturally, those SEs are much lower than the SEs reported from the lsmeans output.

$lsmeans
 ageGroup     lsmean        SE df  asymp.LCL asymp.UCL
 young    0.01043529 0.2466025 NA -0.4728967 0.4937673
 old      0.37467465 0.2474967 NA -0.1104101 0.8597594

$lsmeans
 filter                     lsmean        SE df   asymp.LCL asymp.UCL
 cropDesat37           -0.01699274 0.2462214 NA -0.49957782 0.4655923
 cropAdaptiveThreshold  0.40210268 0.2478032 NA -0.08358268 0.8877880

So it would be:
$$
\displaystyle \begin{array}{l}
\bar{X_1}-\bar{X_2}= lsmean\_of\_cropAdaptiveThreshold – lsmean\_of\_old \\
= 0.40210268 – 0.37467465 = 0.0274\\ \\
SE_{\bar{X_1}-\bar{X_2}}= \sqrt{SE_1^2 + SE_2^2 + 2*cov}\\
= \sqrt{0.075906^2 + 0.076379^2 + 2*-3.225794e-05}\\
= 0.10798
\end{array}
$$

…which is the same SE as the SE in the beta-contrast.

Is it legal to compare cross-factor level lsmeans like this?

Edit 2021-07-26: I think what I probably conceptually wanted to do at the time I wrote this was compare the relative magnitude of the estimates. I'm pretty sure I could have done that by doing t-tests with the standards and their standard errors, doing some hand-waiving due to the repeated measures nature of the data. I certainly did not write this question that way though.

Best Answer

Let $\hat y_{ij}$ denote the prediction from your model at the $i$th level of ageGroup $(i=1,2)$ and the $j$th level of filter $(j=1,2)$. Then the LS means you seem to want to compare are $\hat y_{2\cdot} = (\hat y_{21}+\hat y_{22})/2$ and $\hat y_{\cdot 2} = (\hat y_{12}+\hat y_{22})/2$. Consequently, $$ \hat y_{\cdot 2} - \hat y_{2\cdot} = [(\hat y_{12}+\hat y_{22})/2] - [(\hat y_{21}+\hat y_{22})/2] = (\hat y_{12} - \hat y_{21})/2 $$ So what you think is a comparison of two marginal LS means is really one half of a diagonal comparison of two cell means. You can get it from lsmeans using code like this:

library(lsmeans)
m3.lsm = lsmeans(m3.only_within, c("ageGroup", "filter"))
contrast(m3.lsm, list(diag.comp = c(0, -1/2, 1/2, 0)))

I still, however, think you are going to confuse other people, if not yourself, by calling this a comparison of marginal LS means of different factors.