Solved – Multilevel logistic regression with a random slope(s)

interpretation, logistic, multilevel-analysis, regression

I would like to specify a two-level logistic regression model with random intercept and random slope.

Dependent variable: hospitalization (1) or no-hospitalization (0).

Independent variables: age, number of drugs used, comorbidity, others…

Multilevel structure: patients clustered within hospitals. Hospitalization rate varies across different hospitals.

The goal is to identify variables in the model that explain the largest part of variation in hospitalization rate.

I have two questions:

  1. Can I fit several models, each with a random intercept plus one random slope, keeping all other variables as fixed effects? I would like to look at how much variance is explained by each patient characteristic, but when I add several random slopes the model does not converge.

  2. When I specify the random slope, the intercept variance is inflated. How can I interpret that?

Best Answer

  1. Do you mean that you are turning age, number of drugs, etc. into random effects? When you do that, you are assuming that the impact of age, drugs, comorbidities, and so on differs from hospital to hospital, and that this impact is distinct from the basic "hospital" effect that you have already included. It doesn't surprise me that the model does not converge, since you are going to have a lot of parameters here, not all of which may be needed by the model: if the random effects are allowed to correlate, q of them per hospital require q(q+1)/2 variance and covariance parameters. Useless parameters do not change the value of the likelihood much, so the optimizer can end up roaming around in a trough somewhere, not sure where to go next.

Furthermore, if a particular effect is not random, or its true variance is extremely small, the optimizer will be pushed towards the boundary of the parameter space (since a variance can't be negative). It can then stop at a point where the gradient is still non-zero and report a non-convergence error.
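To get a feel for how often a variance estimate runs into that boundary, here is a minimal sketch. It is not your logistic model: as a simplification, I assume a continuous outcome and use the one-way ANOVA (method-of-moments) estimate of the between-group variance instead of maximum likelihood. The function names, group sizes, and seed are all made up for illustration. The data are simulated with no group effect at all, and the code counts how often the raw estimate is zero or negative, i.e. how often a constrained fit would sit on the boundary.

```python
import random
import statistics


def between_group_variance(groups):
    """One-way ANOVA (method-of-moments) estimate of the between-group variance.

    Assumes all groups have the same size. The raw estimate can be negative.
    """
    n = len(groups[0])
    group_means = [statistics.fmean(g) for g in groups]
    # Mean square between groups: n times the sample variance of the group means
    msb = n * statistics.variance(group_means)
    # Mean square within groups: average of the within-group sample variances
    msw = statistics.fmean(statistics.variance(g) for g in groups)
    return (msb - msw) / n


def share_on_boundary(n_sims=500, n_groups=10, n_per_group=20, seed=1):
    """Share of simulated data sets whose raw variance estimate is <= 0.

    The true between-group variance is zero: every observation is N(0, 1).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        groups = [
            [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
            for _ in range(n_groups)
        ]
        if between_group_variance(groups) <= 0.0:
            hits += 1
    return hits / n_sims


print(share_on_boundary())
```

In this setup a sizeable share of the data sets gives a non-positive raw estimate, which a fit restricted to non-negative variances would report as exactly zero. A random slope whose true variance is close to zero puts the optimizer in much the same situation.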

  2. This is not strange. If slopes are needed in the model, but you omit them, the intercept parameter will have to cover the differences between hospitals by itself. This will force the intercept parameter to cover a wider range of values than it would otherwise need to, and hence it will have a larger variance.

[Figure: green lines with a common intercept and different slopes; red lines with different intercepts and zero slope]

Pretend that your data look like the figure. The true model is shown by the green lines: common intercept, but different slopes. However, if you fit a model with different intercepts and 0 slope, you will get something like the red lines, as the intercept struggles to carry the full burden of the variation in the data all by itself. That's why the variance increases when the slopes are omitted.
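The same point can be checked with a small simulation. This is only a sketch under simplifying assumptions: the outcome is continuous rather than binary, each hospital is fitted separately by least squares instead of with a mixed model, and every name and number (30 hospitals, 50 patients each, a common intercept of 2.0) is made up for illustration. The "green" fit estimates an intercept and a slope per hospital; the "red" fit forces the slope to 0, so its intercept is just the hospital mean.

```python
import random
import statistics


def simulate_hospitals(n_hospitals=30, n_patients=50, seed=0):
    """Simulate hospitals that share one intercept but differ in slope."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_hospitals):
        slope = rng.gauss(1.0, 0.5)  # hospital-specific slope
        xs = [rng.uniform(0.0, 10.0) for _ in range(n_patients)]
        # Common intercept of 2.0 for every hospital, plus unit-variance noise
        ys = [2.0 + slope * x + rng.gauss(0.0, 1.0) for x in xs]
        data.append((xs, ys))
    return data


def intercept_with_zero_slope(xs, ys):
    """Least-squares intercept when the slope is forced to 0: the mean of y."""
    return statistics.fmean(ys)


def intercept_with_free_slope(xs, ys):
    """Ordinary least-squares intercept when the slope is estimated too."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return y_bar - (sxy / sxx) * x_bar


hospitals = simulate_hospitals()
var_red = statistics.variance(
    [intercept_with_zero_slope(xs, ys) for xs, ys in hospitals]
)
var_green = statistics.variance(
    [intercept_with_free_slope(xs, ys) for xs, ys in hospitals]
)
print("intercept variance, slopes omitted:  ", var_red)
print("intercept variance, slopes estimated:", var_green)
```

With the slopes omitted, the spread of the fitted intercepts is much larger, even though the true intercept is identical in every hospital. The variation in slopes has nowhere else to go.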

And by the way ... welcome to the site and good luck with your analysis.