Solved – When do you use logistic regression vs. when do you use OLS

least squares, logistic, regression

If you are building a regression model where the response variable is numerical, but one of the explanatory variables is a dummy (binary), can you use OLS?

Do you only use logistic regression if your response variable is binary?

Best Answer

In short:

  1. Yes, if your response variable is continuous, even if one of the X variables is binary, you can use OLS.
  2. Yes, you should only use logistic regression if your response variable is binary.

Why is this? To your first question, OLS handles binary X variables just fine. For simplicity, consider a model with two X variables. (Assume that they aren't perfectly correlated.)

$$ y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i $$

If $x_{1i}$ is binary, then we interpret $\beta_1$ as follows: holding $x_{2i}$ constant, it's the predicted change in $y_i$ from observing $x_{1i} = 1$ instead of $x_{1i} = 0$. (This holds whether $x_{2i}$ is continuous or binary, by the way.)
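To make that interpretation concrete, here is a minimal sketch using only the Python standard library. The data, the helper names `solve_linear_system` and `ols`, and the "true" coefficients are all made up for illustration; in practice you would fit the model with a statistics package rather than hand-rolled normal equations.

```python
# Minimal OLS sketch (illustrative only): solve the normal equations (X'X) b = X'y.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so each row operation applies to both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Return OLS coefficients for design-matrix rows and responses y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: x1 is a binary dummy, x2 is continuous.
x1 = [0, 1, 0, 1, 0, 1, 0, 1]
x2 = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
# y is generated without noise from alpha = 1, beta1 = 2, beta2 = 0.5,
# so OLS should recover those values up to floating-point error.
y = [1.0 + 2.0 * a + 0.5 * b for a, b in zip(x1, x2)]

rows = [[1.0, a, b] for a, b in zip(x1, x2)]  # leading 1.0 is the intercept column
alpha, beta1, beta2 = ols(rows, y)

# Predicted change in y from switching x1 from 0 to 1 while holding x2 fixed.
x2_fixed = 3.0
effect = (alpha + beta1 * 1 + beta2 * x2_fixed) - (alpha + beta1 * 0 + beta2 * x2_fixed)
```

Here `effect` matches `beta1` whatever value you pick for `x2_fixed`, which is exactly the "holding $x_{2i}$ constant" reading of the coefficient on the dummy.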

To your second question, binary logistic regression is designed specifically to model a binary response. Recall that the underlying model is the following: for a vector $x_i$, the probability that $y_i = 1$ is

$$ P(y_i = 1 \mid x_i) = \frac{\exp(\beta' x_i)}{1 + \exp(\beta' x_i)}. $$

The right-hand side always lies strictly between 0 and 1, so it describes a probability rather than an unbounded outcome. This doesn't generalize to continuous $y$. (It does generalize to categorical $y$; this is called multinomial logistic regression.)
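As a quick illustration of that formula, the sketch below evaluates $P(y_i = 1 \mid x_i)$ for two covariate vectors that differ only in a binary regressor. The coefficient values and the function name `logistic_probability` are assumptions chosen for the example, not estimates from any real data.

```python
import math


def logistic_probability(beta, x):
    """Return P(y = 1 | x) = exp(beta'x) / (1 + exp(beta'x))."""
    z = sum(b * v for b, v in zip(beta, x))
    # Both branches are algebraically the same quantity; splitting on the
    # sign of z avoids overflow in exp() when |z| is large.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical coefficients: intercept, binary regressor, continuous regressor.
beta = [-1.0, 2.0, 0.5]

# Same continuous value, binary regressor switched off and then on.
x_off = [1.0, 0.0, 3.0]  # leading 1.0 multiplies the intercept
x_on = [1.0, 1.0, 3.0]

p_off = logistic_probability(beta, x_off)
p_on = logistic_probability(beta, x_on)
```

Whatever coefficients you plug in, the output stays inside $(0, 1)$, which is why this model suits a binary response and not a continuous one.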
