Solved – Ordinary Least Squares Regression with binary dependent variable

least squaresregression

I know that OLS regression is linear and output expected is continuous and values will fall higher than 1 or less than 0 so is no meaning of values what are not between 0 and 1 (here pointing to values 20, 30 etc not strictly around 1), can’t be interpreted.

My questions is: Mathematically speaking: Why is not appropriate to use OLS regression when we have binary dependent variable?

Best Answer

This approach can (and likely will) produce values that are impossible, and there are other methods like logistic regression that avoid this issue. That’s one reason why OLS might not be preferred for such a situation.

However, the math works out fine. Nothing will stop you from fitting an OLS model with your parameters estimated the usual way with $\hat\beta=(X^TX)^{-1}X^Ty$. If such a model works best for what you’re doing, as proponents of linear probability models believe, then go for it!