Solved – Make Nonlinear Smooth Interpretable in Logistic GAM Regression

generalized-additive-model, interpretation, marginal-effect, r

I have a nonlinear smooth fit in a logistic regression from the package mgcv in R. Visualizing the smooth, the y-axis I get using either plot(mymod) or predict.gam(mymod, type="terms") is in log-odds. I would like to change the y-axis to be something more interpretable.

If this were a linear regression with just one linear coefficient to interpret, I would calculate the average marginal effect for that coefficient. However, since the effect is nonlinear (it is a smooth spline) and I am trying to interpret the y-value at each given x-value, I do not think a marginal effect (change in x-value from 0 to 1) is exactly what I am looking for.

To put it more concretely, I have the following plot: [figure: estimated smooth of Predictor Variable, y-axis in log-odds]

I could estimate the average marginal effect by calculating, for every observation, the average change in the probability of the outcome as the Predictor Variable changes from 0 to 1. But this tells me nothing about the effect on the probability of the outcome when Predictor Variable is equal to -2. How can I convert that change in log-odds when Predictor Variable is equal to -2 (0.1629) into a more interpretable value, like the change in the probability of the outcome?

Best Answer

I don't think you can achieve what you want from the smooth alone, because you need the intercept to calculate the estimated probability for a given value of the covariate. predict(mymod, type = "response") will get you that.
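To see why the intercept matters, here is a minimal sketch in base R. The three intercept values are made up for illustration (the question does not report the model's intercept); only the 0.1629 comes from the question. The same shift in log-odds produces a different change in probability depending on the baseline it is added to.

```r
# Hypothetical baseline log-odds (intercepts); these values are invented for illustration.
intercepts <- c(rare = -3, even = 0, common = 3)

# Partial effect of the predictor at x = -2, as quoted in the question.
smooth_value <- 0.1629

p_baseline <- plogis(intercepts)                 # probability when the smooth contributes 0
p_at_x     <- plogis(intercepts + smooth_value)  # probability once the smooth is added
change     <- p_at_x - p_baseline                # change in probability for the same log-odds shift

round(cbind(p_baseline, p_at_x, change), 4)
```

The change is largest when the baseline probability is near 0.5 and shrinks as the baseline moves towards 0 or 1, which is why a log-odds value on its own cannot be turned into a single change in probability.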

This becomes more difficult when you have additional covariates in the model and want to look at the effect of varying covariate $x$ on the response. In that situation you need to hold the other covariates at some representative value and then predict() (on the type = "link" scale if you want to compute a confidence interval from the standard error, back-transforming to the response scale afterwards).
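A sketch of that workflow, assuming simulated data and a second covariate z that are not part of the original question (the names dat, x, z, and newd are hypothetical, and holding z at its mean is just one possible choice of representative value):

```r
library(mgcv)

set.seed(1)
n <- 500
dat <- data.frame(x = rnorm(n), z = rnorm(n))                 # simulated covariates (hypothetical)
dat$y <- rbinom(n, 1, plogis(-1 + sin(dat$x) + 0.5 * dat$z))  # simulated binary outcome

mymod <- gam(y ~ s(x) + s(z), family = binomial, data = dat, method = "REML")

# Vary x over its observed range; hold z at a representative value (here its mean).
newd <- data.frame(x = seq(min(dat$x), max(dat$x), length.out = 100),
                   z = mean(dat$z))

# Predict on the link (log-odds) scale and keep the standard errors.
p <- predict(mymod, newdata = newd, type = "link", se.fit = TRUE)

# Back-transform the fit and an approximate 95% pointwise interval to probabilities.
ilink <- family(mymod)$linkinv                     # inverse link; the logistic function here
newd$fit   <- as.numeric(ilink(p$fit))
newd$lower <- as.numeric(ilink(p$fit - 1.96 * p$se.fit))
newd$upper <- as.numeric(ilink(p$fit + 1.96 * p$se.fit))

head(newd)
```

Building the interval on the link scale and transforming afterwards keeps the limits inside (0, 1), which an interval computed directly on the probability scale would not guarantee.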

But either way, you need the intercept (constant) term.

Log-odds isn't always so immediately convenient, but some pointers can make these plots easier to understand. 0 represents the overall mean, so if the log-odds are positive for some values of the covariate $x$, the probability of the event is higher than average. If the log-odds are 0, or close to it, for some values of $x$, the probability would be unchanged, and similarly negative log-odds would indicate that the probability is below the average for those covariate values. But if you really want a probability value, then you would need to predict() and hold any other covariates in the model at some representative value.
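As a rough illustration of reading the plot this way, the sketch below fits a model with a single smooth to simulated data (the data and the names x, y, sm, and b0 are hypothetical) and adds the intercept back to the centred smooth before converting to a probability. Note that the reference level here is the intercept on the log-odds scale, so plogis(b0) is the probability where the smooth is 0, which need not equal the raw average of the fitted probabilities.

```r
library(mgcv)

set.seed(2)
n <- 400
x <- runif(n, -3, 3)                        # simulated predictor (hypothetical)
y <- rbinom(n, 1, plogis(-0.5 + sin(x)))    # simulated binary outcome

mymod <- gam(y ~ s(x), family = binomial, method = "REML")

# Centred smooth contribution on the log-odds scale: the quantity plot(mymod) draws.
sm <- predict(mymod, type = "terms")[, "s(x)"]
b0 <- coef(mymod)[["(Intercept)"]]

p_mean <- plogis(b0)        # probability where the smooth equals 0
p_x    <- plogis(b0 + sm)   # probability at each observed x

# With a single smooth, intercept + smooth is the whole linear predictor,
# so this matches the response-scale predictions.
all.equal(unname(p_x), as.numeric(predict(mymod, type = "response")))
```

Positive values of sm correspond to probabilities above p_mean and negative values to probabilities below it. plot.gam also has shift and trans arguments, so plot(mymod, shift = b0, trans = plogis) should draw the smooth on the probability scale directly.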
