Solved – How to partial out a covariate in logistic regression

Tags: logistic-regression, residuals, stepwise-regression

I was performing logistic regression on some data, and I realised that I needed to remove, or partial out, the effect of another covariate.

If $x$ is the predictor of interest, the model is:

$$ y \sim \mathbf{logit}^{-1}(ax+b) $$
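For concreteness, this model can be fit with a few lines of NumPy via Newton–Raphson (IRLS); the `fit_logistic` helper and the simulated data below are illustrative assumptions, not part of the original post:

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit y ~ logit^{-1}(X @ beta) by Newton-Raphson (IRLS).

    An intercept column is prepended, so the returned vector is
    [intercept, slope(s)].
    """
    X = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))       # fitted probabilities
        W = p * (1.0 - p)                         # IRLS weights
        # Newton step: beta += (X' W X)^{-1} X' (y - p)
        beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return beta

# Simulate data with true a = 1.5, b = -0.5 (hypothetical values)
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = (rng.random(2000) < 1 / (1 + np.exp(-(1.5 * x - 0.5)))).astype(float)

b, a = fit_logistic(x[:, None], y)  # estimates should be near (-0.5, 1.5)
```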

One option would be to include the covariate $z$ as another regressor:

$$ y \sim \mathbf{logit}^{-1}(ax+b + cz) $$

But in general $z$ might be correlated with $x$. So for my purposes, it is important to first regress out $z$. I don't want the parameter of interest $a$ to reflect any variance that could be explained by $z$.

If this were linear regression, I'd take the residuals from the regression $y\sim cz+b$, then regress these residuals against $x$: $(y-\hat{y})\sim x$ to obtain the parameter of interest. (By the way, is there a name for this process, of regressing out a variable that is not of interest?)
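(The two-step linear procedure described above is usually called partialling out, or residualizing; the Frisch–Waugh–Lovell theorem is the classical result about it.) A minimal NumPy sketch of the linear case, with simulated data and coefficient values that are purely illustrative — note that residualizing only $y$ on $z$ does *not* reproduce the joint-regression coefficient when $x$ and $z$ are correlated; for that, $x$ must be residualized on $z$ as well:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
z = rng.normal(size=n)
x = 0.6 * z + rng.normal(size=n)            # x correlated with z
y = 2.0 * x + 1.0 * z + rng.normal(size=n)  # true coefficient on x is 2.0

def ols(X, y):
    """OLS with an intercept; returns [intercept, slope(s)]."""
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Step 1: regress y on z, keep the residuals
b0, c = ols(z[:, None], y)
resid = y - (b0 + c * z)

# Step 2: regress the residuals on x  -> biased, because x correlates with z
_, a_twostep = ols(x[:, None], resid)

# Joint regression (what the answer below recommends) -> recovers ~2.0
_, a_joint, _ = ols(np.column_stack([x, z]), y)

# Frisch-Waugh-Lovell: also residualize x on z -> matches the joint estimate
d0, d1 = ols(z[:, None], x)
x_resid = x - (d0 + d1 * z)
_, a_fwl = ols(x_resid[:, None], resid)
```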

But for a logistic regression, I am not sure how to treat the residuals from the first regression. Say I do $\hat{y}=cz+b$. Can I attempt a logistic regression on the (non-binary) residuals $(y-\hat{y})$? Or perhaps perform a linear regression on some transformed function of the residuals? I am thinking of something like:

$$\log\bigg(\frac{y-\hat{y}}{1-(y-\hat{y})}\bigg)$$

Best Answer

You don't do this. You shouldn't do what you are describing in a linear regression context either. All you need to do is include both variables in a multiple regression (multiple logistic regression) model. That will take care of this for you. Moreover, it won't matter if $x$ is correlated with $z$. If they are, then the standard errors will be larger (appropriately), but the estimated coefficients will be correct.
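A minimal sketch of this recommendation in NumPy, using a hand-rolled Newton–Raphson (IRLS) fit; the simulated data and the true coefficients ($a=1.0$, $c=0.8$, intercept $-0.3$) are illustrative assumptions, not from the original post. Even though $x$ and $z$ are correlated, the joint logistic model recovers both coefficients:

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit y ~ logit^{-1}(X @ beta) by Newton-Raphson (IRLS)."""
    X = np.column_stack([np.ones(len(X)), X])  # intercept first
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return beta

rng = np.random.default_rng(2)
n = 5000
z = rng.normal(size=n)
x = 0.6 * z + rng.normal(size=n)            # x correlated with z
eta = 1.0 * x + 0.8 * z - 0.3               # true linear predictor
y = (rng.random(n) < 1 / (1 + np.exp(-eta))).astype(float)

# One model with both regressors: coefficient on x is estimated
# controlling for z, with no residualizing step needed.
b_hat, a_hat, c_hat = fit_logistic(np.column_stack([x, z]), y)
```

In practice one would use an off-the-shelf fitter (e.g. a GLM routine) rather than hand-rolled IRLS; the point is simply that both covariates go into a single model.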
