Logistic Regression – Assumption of Linearity Between Variables and Log Odds

assumptions, intuition, linearity, logistic, regression

I know that in logistic regression we assume a linear relationship between the independent variables and the logits.
Can you explain why this is a reasonable assumption?

Best Answer

I think this could be answered a few different ways. One answer is that it is NOT reasonable (I'd probably land in this camp if you pushed me hard enough): linearity is just the simplest assumption we can make about the effects of the variables, and so we make it. The reason the assumption is placed on the log-odds scale rather than the probability scale is to avoid unrealistic predictions. The logit, log(p / (1 - p)), ranges over the entire real line, so modelling the effects of the covariates on that unbounded scale means we never have to deal with a predicted mean that is less than 0 or greater than 1.
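To make the point about unrealistic predictions concrete, here is a minimal sketch in plain Python. The intercept, slope, and covariate values are made up for illustration and are not estimated from any data. It compares a model that is linear on the probability scale with one that is linear on the log-odds scale.

```python
import math

def inverse_logit(eta):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and covariate values, chosen only for illustration.
intercept, slope = 0.5, 0.2
xs = [-10.0, -2.0, 0.0, 2.0, 10.0]

# Linear on the natural (probability) scale: nothing keeps this inside [0, 1].
natural_scale = [intercept + slope * x for x in xs]

# Linear on the log-odds scale, then mapped back to a probability.
logit_scale = [inverse_logit(intercept + slope * x) for x in xs]

for x, p_lin, p_logit in zip(xs, natural_scale, logit_scale):
    print(f"x={x:6.1f}  natural scale: {p_lin:6.2f}  logit scale: {p_logit:.3f}")
```

With these numbers the natural-scale model produces "probabilities" below 0 and above 1 at the extreme covariate values, while the logit-scale model stays strictly between 0 and 1 for any finite input.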

A different argument is that, within a small enough neighbourhood of the covariates, the log odds are approximately linear, as any smooth function is locally (a first-order Taylor approximation). If we believe we are examining a sufficiently small region of the possible covariate space, maybe this assumption is good enough.
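The neighbourhood argument can be sketched numerically. The example below assumes, purely for illustration, that the true log odds are a quadratic function of a single covariate. It then measures how far the tangent line at a hypothetical point x0 = 1 strays from the curve over a wide and a narrow interval.

```python
def true_log_odds(x):
    # Hypothetical nonlinear log-odds function, for illustration only.
    return x ** 2

def tangent_line(x, x0=1.0):
    # First-order Taylor expansion of x**2 around x0; the derivative is 2 * x0.
    return true_log_odds(x0) + 2.0 * x0 * (x - x0)

def max_abs_error(half_width, x0=1.0, n_points=201):
    # Largest gap between the curve and its tangent on [x0 - h, x0 + h].
    step = 2.0 * half_width / (n_points - 1)
    grid = [x0 - half_width + i * step for i in range(n_points)]
    return max(abs(true_log_odds(x) - tangent_line(x, x0)) for x in grid)

wide = max_abs_error(1.0)
narrow = max_abs_error(0.1)
print(f"half-width 1.0: max error {wide:.4f}")
print(f"half-width 0.1: max error {narrow:.4f}")
```

For this particular choice the gap between curve and tangent is exactly (x - x0)^2, so shrinking the neighbourhood by a factor of 10 shrinks the worst-case error by a factor of 100. Whether a real data set covers a "small enough" region is a judgement call that the sketch does not settle.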