I am using "lme4" package to fit mixed-effects nonlinear logistic model to access the association of Y and X. As the response variable of my data is binary and nlmer function requires response variable to be continuous, I use glmer function and "rms" package function rcs to fit the model and visualize the nonlinear association like the R code below:
library(lme4)
library(rms)
m <- glmer(r2 ~ rcs(Anger, 5) + Gender + situ + btype + (1 | id),
data = VerbAgg, family = binomial("logit"),
control=glmerControl(optimizer="bobyqa"))
p <- predict(m, newdata = VerbAgg, type = "link")
scatter.smooth(VerbAgg$Anger,p,pch='.',col="blue",lpars=list(type="l",col="red"))
I have some questions about using this code:
- Is the code correct?
- How to predict the ln(Odds) of r2? Is it "p <- predict(m, newdata = VerbAgg, type = "link")" ?
- How to visualize the spline of ln(Odds) of r2 and Anger? Is it correct to use "scatter.smooth" function to plot and add a smooth curve in scatters?
- How to get the P-nonlinear for this model?
Best Answer
Your code fits a binomial model for
r2
with main effectsAnger
(transformed with a restricted cubic spline with 5 knots),Gender
,situ
andbtype
, with a random intercept perid
, and it runs. If that is what you wanted, then your code is correct.I get a number of warnings about near unidentfiability and non-convergence. Five spline knots is a lot; it may be useful to reduce this number. I changed your model to one with only three knots, which converges nicely:
Odds ratios and log odds always compare the odds between two situations (i.e., predictor settings) A and B. The two situations might differ only in
Anger
, with all other predictors being equal between the two situations. Or they might have the sameAnger
, but differ onsitu
. So the first thing you have to decide is which situations you want to compare.I would love to show a simple way of calculating this, but I got an error I can't resolve. So I'll show the way "by hand". Suppose situation A is
Anger=17
, and situation B isAnger=21
, and all other predictors are identical between the two situations, e.g.,Gender=M
,situ=other
,btype=curse
(but the specific settings don't matter actually, because everything will cancel).Now, we first have to spline transform the new
Anger
using the knots and locations from the original data:We set up the
newdata
for predicting:Now, the following should work:
But unfortunately, it throws a cryptic error that I can't resolve even after googling. So we will do this the hard way. Here are our fixed effects parameter estimates:
Here are the spline-transformed new
Anger
values:Now, in logistic regression, the log odds are just the difference in the evaluated link function between the two situations. Since the other predictors are equal and we have no interactions, their contributions to the link function cancel. We are left with
$$ (17-21)\times 0.10035866 + (0.187500-2.286706)\times(-0.05907553) = -0.2774229.$$
The first components come from the design matrices (the linear and a single non-linear spline transform), the second ones are the parameter estimates. If you want to find the log odds for some more complicated pair of situations, you will need to include the corresponding parameter estimate. Also note that this is on population level across all
id
s - of course you could also estimate log odds between two specificid
s, and then you would include their random effects.Per above, it makes no sense to visualize "the log odds of
Anger
" (whether spline transformed or not), because (log) odds inherently compare two situations, so there is no "log odds ofAnger=20
" without a comparison to some other situation. But you can calculate useful log odds as above and plot them. There is just nothing useful to plot againstAnger
here. (You could plot response predictions for varying levels ofAnger
againstAnger
. For this, you would need to decide which settings for the other predictors you are interested in.)ANOVA and similar methods always compare two different models, e.g., a focal model of interest against a null (intercept-only) model. If I understand you correctly, you want to compare your mixed model against a logistic regression without random effects. That is an entire can of worms by itself, which has been addressed a couple of times here. You really want a likelihood ratio test, here are relevant threads. This looks most promising: LRT comparing a random effects model and nested logistic regression model.