Solved – Probabilities of odds ratios in random intercept models

Tags: lme4-nlme, mixed-model, odds-ratio, probability

I'm using R and the lme4 package to compute mixed effects models with a binary outcome (glmer). I have included continuous predictors (e.g. how many hours per week a person cares for an elderly relative) and – as it is a comparison of 6 countries – country as a random intercept (see my question here for details on the model).

Now I'd like to interpret the fixed effects with consideration of the random effects (country intercept).

I already do this for the fixed effects without considering the random effects: I multiply each "x" value of a predictor by its related estimate (xbeta, first line in the sample code below) and then use a formula to convert the fixed-effects intercept + each xbeta from the log-odds scale to probabilities:

mydf.vals$xbeta <- mydf.vals$value * (fixef(fit)[coef.pos])
mydf.vals$prob <- (1/(1+exp(-(fixef(fit)[1] + mydf.vals$xbeta))))

(This is roughly the same approach as described in this question.)
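As a side note, the inverse-logit formula above is exactly what base R's plogis() computes, so the conversion can be cross-checked without any package. A minimal standalone sketch (intercept and slope values here are made up):

```r
# inverse logit "by hand" vs. base R's plogis(); plogis(x) == 1/(1 + exp(-x))
intercept <- -3.46   # made-up fixed-effects intercept
beta      <-  0.74   # made-up slope
x         <-  0:9    # predictor values
prob.hand   <- 1 / (1 + exp(-(intercept + beta * x)))
prob.plogis <- plogis(intercept + beta * x)
stopifnot(isTRUE(all.equal(prob.hand, prob.plogis)))
```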

Now my question is: if I'd like to see how a coefficient / an odds ratio varies between countries (random intercept), would it be correct to retrieve the random effects (with ranef) to get the intercept estimates for each country level, and then repeat the above formula for each country level?

for example:

ranef(fit)
$g2ctry
   (Intercept)
30 -0.05605686
39  0.44139287
44 -0.23863412
46 -0.29867999
48  0.41890025
49 -0.27687811

rand.ef <- ranef(fit)[[1]]
for (i in 1 : nrow(rand.ef)) {
    mydf.vals$prob <- (1/(1+exp(-(rand.ef[i, ] + mydf.vals$xbeta))))
    # plot probability curve here...
}
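One detail worth noting here (a sketch, not necessarily the full answer): ranef() returns the group-level deviations from the fixed-effects intercept, while coef() returns the combined group-level coefficients, so the group-specific intercept on the log-odds scale is fixef(fit)[1] + ranef(fit)[[1]][, 1]. This can be verified with the sleepstudy example from the edit below:

```r
library(lme4)
# binary outcome from the sleepstudy data, as in the edit below
sleepstudy$Reaction.dicho <- ifelse(sleepstudy$Reaction <= median(sleepstudy$Reaction), 0, 1)
fit <- glmer(Reaction.dicho ~ Days + (1 | Subject),
             sleepstudy, family = binomial("logit"))
# group-level intercept = fixed intercept + random deviation
combined <- unname(fixef(fit)["(Intercept)"]) + ranef(fit)$Subject[, "(Intercept)"]
all.equal(combined, coef(fit)$Subject[, "(Intercept)"], check.attributes = FALSE)
```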

Or, more generally: if I have random effects from a random-intercept model, why and how should I link the random effects (the intercepts of the group levels) to the fixed-effects coefficients? (That is, what are the benefits of knowing the random effects of the grouping variable (random intercept)?)

EDIT: some kind of reproducible example

Since my data set is quite large and I don't know if I can upload it, this example uses the data set sleepstudy from the lme4 package. The function for creating the plots (sjp.glmer) is taken from the current development snapshot of my sjPlot package.

library(lme4)
# create binary response
sleepstudy$Reaction.dicho <- ifelse(sleepstudy$Reaction <= median(sleepstudy$Reaction, na.rm = TRUE), 0, 1)
# fit model
fit <- glmer(Reaction.dicho ~ Days + (1 | Subject),
             sleepstudy,
             family = binomial("logit"))
# plot random effects and probability curve of fixed effects
sjp.glmer(fit, showContPredPlots = T)

The above code produces the following two plots:

[Plot 1: random effects of Subject with confidence intervals]

[Plot 2: probability curve of the fixed effect Days]

The first plot shows the random effects as retrieved by ranef, plus confidence intervals computed as arm::se.ranef * 1.96.
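For reference, those intervals can be computed directly; a minimal sketch (assuming the arm package is installed, and refitting the sleepstudy model from above):

```r
library(lme4)
library(arm)
sleepstudy$Reaction.dicho <- ifelse(sleepstudy$Reaction <= median(sleepstudy$Reaction), 0, 1)
fit <- glmer(Reaction.dicho ~ Days + (1 | Subject),
             sleepstudy, family = binomial("logit"))
re <- ranef(fit)$Subject[, "(Intercept)"]          # conditional modes
se <- arm::se.ranef(fit)$Subject[, "(Intercept)"]  # approximate standard errors
ci <- data.frame(lower = re - 1.96 * se, estimate = re, upper = re + 1.96 * se)
```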

The "probability curve" in the 2nd plot is calculated by multiplying fixed-effects-intercept (-3.4601, see summary(fit)) with each value from Days (range from 0 to 9) multiplied with Days's estimate.

Example:

x1 = (1 / (1 + exp(-(-3.4601 + 0 * 0.7426))))
x2 = (1 / (1 + exp(-(-3.4601 + 1 * 0.7426))))
# and so on, from 0 to 9       ^ here

Now, this "probability curve" is only based on fixed effects. My question is, if it would make sense to plot this curve for each "intercept estimate" from the random effects, i.e. to use the above formula and replace -3.4601 with each random effects value (which are, in this case, random intercepts, i.e. the intercepts of each group level).

Would this be an appropriate way to interpret the "differences" or variance of `Days` in each group?

EDIT 2: Another example

Let me give a comprehensive example of what I want to do:

library(lme4)
library(reshape2)
library(ggplot2)
# create binary response
sleepstudy$Reaction.dicho <- ifelse(sleepstudy$Reaction <= median(sleepstudy$Reaction, na.rm = T), 0, 1)
# fit model
fit <- glmer(Reaction.dicho ~ Days + (1 | Subject),
             sleepstudy,
             family = binomial("logit"))
# get random effects (random intercept estimates)
rand.ef <- ranef(fit)[[1]]
# find unique values of continuous coefficient,
# for x axis
vals.unique <- sort(unique(sleepstudy$Days))
# melt variable
mydf.vals <- data.frame(melt(vals.unique))
# add "counter" from 1 to length of unique vals
# in this particular case, "Days" is also a "normal" sequence,
# so there's not much benefit here - however, if you have e.g.
# "workhours per week", you may have values from 20 to 80, with certain
# values not in the data (if no one works 25 hours). In that case, 
# the following "counter"-sequence makes sense
mydf.vals <- cbind(seq(from = 1, to = nrow(mydf.vals), by = 1), mydf.vals)
# set colnames. x = x-axis value, "value" is the "real" data value,
# which was observed
colnames(mydf.vals) <- c("x", "value")
# calculate x-beta by multiplying original values of "Days" 
# with estimate of "Days"
mydf.vals$xbeta <- mydf.vals$value * fixef(fit)[2]
# the data frame for plotting
final.df <- data.frame()
final.grp <- c()
# example only for the first 6 grouping levels,
# plot would else be too overloaded...
for (i in 1 : 6) {
  # y-value (probability), obtained by adding the x-betas from
  # "Days" to each random-intercept estimate (one estimate per
  # group level) and applying the inverse logit
  mydf.vals$prob <- (1/(1+exp(-(rand.ef[i, ] + mydf.vals$xbeta))))
  final.df <- rbind(final.df, cbind(days = mydf.vals$x, 
                                    prob = mydf.vals$prob))
  # need to add grp vector later to data frame,
  # else "x" and "prob" would be coerced to factors
  final.grp <- c(final.grp, 
                 rep(row.names(rand.ef)[i], times = length(mydf.vals$x)))
}
# add grp vector
final.df$subject <- final.grp
# plot probability curve here...
ggplot(final.df, aes(x = days, y = prob, colour = subject)) +
  geom_point() +
  geom_line()

[Plot: probability curves per Subject along Days]

Above we see the probability curves for each Subject along each Days value, calculated by summing each Subject's intercept and the Days estimates on the log-odds scale and transforming to probabilities. So, I have "linked" random effects and fixed effects.

Is this OK, or is it nonsense to do that? My aim is to say: taking Subject variance into account, we see a delay in reaction time for Subject 310, while for Subject 331 high reaction times are much more likely.

Best Answer

Finally... @BenBolker was right with predict and plogis. What I'm looking for exactly are the predicted values for model terms (i.e. plogis(predict(fit, type = "terms"))); however, I'm not sure how to get predicted values for model terms from merMod objects, because predict.merMod has no type = "terms" option.
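For completeness, a minimal sketch of the predict()/plogis() route (conditional on the random effects; predict.merMod with newdata and type = "response" applies the inverse link for you, giving one probability curve per Subject):

```r
library(lme4)
sleepstudy$Reaction.dicho <- ifelse(sleepstudy$Reaction <= median(sleepstudy$Reaction), 0, 1)
fit <- glmer(Reaction.dicho ~ Days + (1 | Subject),
             sleepstudy, family = binomial("logit"))
# one probability curve per Subject, conditional on its random intercept
nd <- expand.grid(Days = 0:9, Subject = levels(sleepstudy$Subject))
nd$prob <- predict(fit, newdata = nd, type = "response")
# equivalently: plogis(predict(fit, newdata = nd))  # link scale, then inverse logit
```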
