Solved – Lsmeans output for clmm models (R)

Tags: confidence-interval, logistic, lsmeans, ordinal-data, r

I've fitted a cumulative logit model, where the IV is categorical (different animation models being compared), the DV is ordinal (points on a 1-9 scale), and there are some random effects (subject, sentence). The animation models are rated across a bunch of sentences by every subject. I've largely been following this tutorial: http://rcompanion.org/handbook/G_07.html

First I turn the response into an ordered factor and the predictor into an unordered one:

library(ordinal)        # clmm()
library(RVAideMemoire)  # supplies an Anova() method for clmm objects

d$score.f = factor(d$score, ordered = TRUE)
d$model.f = factor(d$model, ordered = FALSE)

m = clmm(score.f ~ model.f + (1 + model.f | sentence) + (1 + model.f | subject), data = d)

Anova(m, type = "II")

I got a significant p-value, so I then ran the post-hoc test with lsmeans and a Tukey adjustment:

lsmeans(m, pairwise ~ model.f, adjust = "tukey")

The results are:

$lsmeans
 model.f    lsmean        SE df  asymp.LCL asymp.UCL
 0       0.2503735 0.4297950 NA -0.5920091  1.092756
 1       0.8620336 0.3462029 NA  0.1834883  1.540579
 2       1.4808482 0.4786811 NA  0.5426504  2.419046

Confidence level used: 0.95 

$contrasts
 contrast   estimate        SE df z.ratio p.value
 0 - 1    -0.6116601 0.2591413 NA  -2.360  0.0479
 0 - 2    -1.2304747 0.2807095 NA  -4.383  <.0001
 1 - 2    -0.6188146 0.3124234 NA  -1.981  0.1170

I found through StackExchange that I could also do this to get confidence intervals:

lsmeans(m, pairwise ~ model.f, adjust = "tukey", mode = "mean.class")

It gives the following results:

$lsmeans
 model.f mean.class        SE df asymp.LCL asymp.UCL
 0         5.465029 0.3372352 NA  4.804061  6.125998
 1         5.939538 0.2476488 NA  5.454156  6.424921
 2         6.387112 0.3248218 NA  5.750473  7.023751

Confidence level used: 0.95 

$contrasts
 contrast   estimate        SE df z.ratio p.value
 0 - 1    -0.4745091 0.2065156 NA  -2.298  0.0561
 0 - 2    -0.9220829 0.2018481 NA  -4.568  <.0001
 1 - 2    -0.4475738 0.2158877 NA  -2.073  0.0954

P value adjustment: tukey method for comparing a family of 3 estimates 

Now, to the actual question: the tutorial I followed was very clear that you should ignore the lsmean, SE, LCL, and UCL values in the lsmeans output. It would be nice to have some way of presenting the results beyond just stating that there was a significant difference between 0 and 2 (and between 0 and 1 in the first lsmeans output – why did the values change?), for example a plot showing the least-squares means with their confidence intervals. One page on the tutorial website (http://rcompanion.org/handbook/G_06.html) does exactly that, but only for a non-clm model. Can such a plot be made for a clmm model, for example using the output of the second lsmeans call above with the mean.class estimates and their SEs?

If not, what other options do I have? I know how to calculate the probabilities of the different rating grades for each model using the coefficients from the model (summary(m)), and I could present those as one bar plot per model. The coefficients have standard errors associated with them, but would simply subtracting and adding those standard errors give the lower and upper ends of the confidence intervals? Or should I instead calculate the expected value from the probabilities?
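For reference, this is roughly how I compute those probabilities now (a sketch; it reads the thresholds from m$alpha and the fixed effects from m$beta as stored by ordinal's clmm, and conditions on random effects of zero, i.e. a "typical" subject and sentence):

theta <- m$alpha                  # cut points theta_1 ... theta_8
eta   <- c(0, m$beta)             # linear predictor per model (0 = reference level)
probs <- sapply(eta, function(e) {
  cum <- c(plogis(theta - e), 1)  # P(score <= j) = logit^-1(theta_j - eta)
  diff(c(0, cum))                 # P(score = j)
})
barplot(probs, beside = TRUE)     # one group of bars per model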

By the way, the data looks like this (so ten sentences, around 12 subjects, three models and a 1-9 rating scale):

subject,sentence,model,score
0,0,0,5
0,0,1,6
0,0,2,7
0,1,0,7
0,1,1,5
0,1,2,8
0,2,0,5
0,2,1,6
0,2,2,7

Best Answer

First, you should switch to the emmeans package, which is the continuation of lsmeans; the latter will be archived on CRAN soon.

Regarding the instruction to ignore certain results: I invite you to ask yourself what kind of statistical software would output results that should be ignored. The more likely possibility is that the writer of that advice does not understand the output and so doesn't want to try to explain it, or disagrees with that method of analysis for an unstated reason (but why then show the method at all?).

The default results of lsmeans() (or emmeans::emmeans()) are on a latent-variable scale. That latent model asserts that there is a continuous but unobservable response having a logistic distribution with a mean that depends on the predictors, and that there is also a set of cut points defining a set of intervals on the latent scale; the observed ordinal response records which interval each latent response falls in. This latent scale is not completely identifiable – any linear transformation of it is equally valid – so lsmeans() picks one particular version, and its results are meaningful relative to that choice of scale.
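To see this concretely, here is a minimal sketch with emmeans, assuming the model m fitted in the question (for clmm objects the latent scale is the default mode, so this reproduces the first set of results above):

library(emmeans)
emmeans(m, pairwise ~ model.f, adjust = "tukey", mode = "latent")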

Moreover, the results with mode = "mean.class" are completely interpretable on an absolute scale. Regarding the ordinal values as 1, 2, ..., 9, the mean.class predictions are the mean of that ordinal variable, computed from the predicted probabilities of each class.
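That means those estimates and their confidence limits can be plotted directly, which answers the plotting question. A minimal sketch, assuming the model m from the question (plot() here is emmeans's built-in plot method for emmGrid objects; the ggplot2 version builds the same display by hand):

library(emmeans)
library(ggplot2)

emm <- emmeans(m, ~ model.f, mode = "mean.class")
plot(emm)                 # built-in plot of estimates with confidence intervals

# Or construct the same display manually from the summary
s <- as.data.frame(emm)   # columns: model.f, mean.class, SE, df, asymp.LCL, asymp.UCL
ggplot(s, aes(model.f, mean.class)) +
  geom_point() +
  geom_errorbar(aes(ymin = asymp.LCL, ymax = asymp.UCL), width = 0.15) +
  labs(y = "Estimated mean rating class (95% CI)")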

For an example detailing the analysis of a clm model, see the relevant emmeans vignette: https://cran.r-project.org/web/packages/emmeans/vignettes/sophisticated.html#ordinal. Documentation for the different options available for ordinal models is provided in https://cran.r-project.org/web/packages/emmeans/vignettes/models.html#O. There is also an index of vignette topics where you may find yet more information.
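Finally, regarding the per-category probabilities mentioned in the question: mode = "prob" estimates those directly (with standard errors), so there is no need to assemble them by hand from the coefficients. A sketch, again assuming the model m:

library(emmeans)
# Estimated probability of each rating category, separately for each model
emmeans(m, ~ score.f | model.f, mode = "prob")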
