Solved – Testing nonlinearity in logistic regression (or other forms of regression)

assumptions, logistic, references, regression, regression-strategies

One of the assumptions of logistic regression is linearity in the logit. So once I got my model up and running, I tested for nonlinearity using the Box-Tidwell test. One of my continuous predictors (X) tested positive for nonlinearity. What am I supposed to do next?

As this is a violation of the assumptions, should I get rid of the predictor (X), include a nonlinear transformation (X*X), or transform the variable into a categorical one?
If you have a reference, could you please point me to that too?

Best Answer

I would suggest using restricted cubic splines (rcs in R; see the Hmisc and Design packages for examples of use, the latter since superseded by rms) instead of adding powers of $X$ to your model. This is the approach recommended by Frank Harrell, for instance, and you will find a nice illustration in his handouts (§2.5 and chap. 9) on Regression Modeling Strategies (see the companion website).
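To make the spline suggestion concrete, here is a minimal sketch on simulated data. It assumes the rms package (the successor to Design) is installed. The data frame `d`, the variables `x` and `y`, the simulated U-shaped effect, and the choice of 4 knots are all made up for illustration and are not taken from the question.

```r
library(rms)  # assumption: rms is installed; it provides lrm() and rcs()

# Simulated data (hypothetical): the true logit is quadratic in x,
# so a model that is linear in x is misspecified.
set.seed(1)
n <- 500
x <- runif(n, 0, 10)
y <- rbinom(n, 1, plogis(-2 + 0.15 * (x - 5)^2))
d <- data.frame(x, y)

# Logistic regression with a restricted cubic spline in x (4 knots,
# placed at default quantiles of x). This gives 3 spline coefficients
# plus the intercept.
fit_rcs <- lrm(y ~ rcs(x, 4), data = d)

# Wald tests for the overall effect of x and for its nonlinear part
print(anova(fit_rcs))
```

The row labelled Nonlinear in the anova table tests the spline terms beyond the linear one, so it plays a role similar to the Box-Tidwell test while also giving you a fitted model that accommodates the curvature.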

You can compare the results with your Box-Tidwell test by using boxTidwell() in the car package.
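One caveat: as far as I can tell from its documentation, boxTidwell() in car is written for linear models. For a logistic model, a commonly used manual variant of the Box-Tidwell check is to add an x * log(x) term and test it. The sketch below does this with base R only. The simulated data and the names `d`, `fit_lin`, `fit_bt` and `bt_test` are hypothetical.

```r
# Simulated data (hypothetical): the true logit is linear in log(x), not in x.
# x must be strictly positive for log(x) to be defined.
set.seed(42)
n <- 500
x <- runif(n, 1, 10)
y <- rbinom(n, 1, plogis(-3 + 2 * log(x)))
d <- data.frame(x, y)

# Model that assumes linearity in the logit
fit_lin <- glm(y ~ x, family = binomial, data = d)

# Same model plus the Box-Tidwell-type term x * log(x)
fit_bt <- glm(y ~ x + I(x * log(x)), family = binomial, data = d)

# Likelihood ratio test (1 degree of freedom) for the added term;
# a small p-value suggests the logit is not linear in x.
bt_test <- anova(fit_lin, fit_bt, test = "Chisq")
print(bt_test)
```

This only flags the nonlinearity. It does not tell you which functional form to use, which is one more reason to prefer the spline fit.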

Transforming continuous predictors into categorical ones is generally not a good idea; see, e.g., Problems Caused by Categorizing Continuous Variables.