The Central Limit Theorem applies in this case. If the residuals are not normally distributed, but the sample size is large enough, then the t statistics will be approximately t-distributed (and the F statistic will be approximately F distributed). How good the approximation is depends on how different the residuals are from the normal and how large the sample size is. Many regression problems have a combination that makes the approximation reasonable.
If there is a reason to believe a different distribution, there are methods to fit regression models under that assumption. GLMs can fit binomial, Poisson, and gamma distributed y's, and maximum likelihood or Bayesian methods (among others) let you fit still other distributions.
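As a minimal sketch of that idea (Python with statsmodels, on simulated data, so the variable names and coefficients are purely illustrative), the same linear predictor can be fit as a GLM and the family swapped to match the response:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Simulated data (illustrative only): one predictor x and a Poisson response y.
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=500)
y = rng.poisson(np.exp(0.5 + 0.8 * x))
df = pd.DataFrame({"x": x, "y": y})

# A Poisson GLM fit by maximum likelihood; other families such as
# sm.families.Gamma() or sm.families.Binomial() can be substituted when the
# response calls for them.
poisson_fit = smf.glm("y ~ x", data=df, family=sm.families.Poisson()).fit()
print(poisson_fit.summary())
```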
But if you are unwilling to assume normality, how sure can you be of some other distribution? Sometimes it is clear, but if the residuals merely look like they might be gamma and you are not certain, then fitting based on a normal may be just as good (because of the CLT) as fitting a gamma that does not actually fit.
If you don't want to make assumptions about the distribution of the residuals, there are options like permutation tests or bootstrapping (or other non-parametric regression tools), but each of these has its own assumptions and conditions under which it works better or worse.
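Here is a sketch of the bootstrap option (simulated data with deliberately non-normal errors; everything here is illustrative, not a prescription): a case-resampling bootstrap gives interval estimates for the coefficients without assuming normal residuals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative only): skewed, clearly non-normal errors.
n = 200
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.exponential(scale=2.0, size=n) - 2.0
X = np.column_stack([np.ones(n), x])

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Case-resampling bootstrap: refit on rows drawn with replacement.
n_boot = 2000
draws = np.empty((n_boot, X.shape[1]))
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)
    draws[b] = ols(X[idx], y[idx])

# Percentile intervals for intercept and slope; no normality assumption needed.
ci = np.percentile(draws, [2.5, 97.5], axis=0)
print(ci)
```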
In the end, what matters most is the question you are trying to answer and what you know about the science that produced the data.
A) "If a poisson regression is used, then Zuur states the standard errors will be wrong." This is similar to the what happens in a linear model if the errors are heteroskedastic: the variance of $\hat{\boldsymbol{\beta}}_{OLS}$ is $\sigma^2(\mathbf{X'X})^{-1}$, where $\sigma^2$ is the supposed homogeneous variance, but if the variance is not homogeneous an adjustment has to be made to the standard errors. The Poisson model restricts the variance to equal the mean (equidispersion). If the variance exceeds the mean (overdispersion), the Poisson standard errors are wrong because their values depend on the estimated variance: if variance is wrong, standard errors are wrong. As you know, if data are heteroskedastic and you apply OLS, you can (should) throw away the wrong standard errors and prefer heteroskedasticity-consistent standard errors, also called White or sandwich standard errors. Youl could compute sandwich standard errors in a Poisson regression too, but a better specified model would be... better (see Zeileis, Kleiber, and Jackman.)
B) "But why does the model need to make assumptions about the distribution of observations around the fitted values?" Every model depends on the assumption that the distribution of observations around the fitted values is optimal! Think about prediction: if the distribution of observations around the fitted values is far from optimal, predictions are unreliable.
C) "Why can't it use information contained in the distribution of the observed residuals to calculate the standard errors?" Standardized residuals are:
$$z_i=\frac{y_i-\hat{y}_i}{\hat\sigma_{y_i}}$$
in a Poisson model $\hat\sigma_{y_i}=\sqrt{\hat{E}(y_i)}=\sqrt{\hat{y}_i}$, because the variance is assumed to equal the mean. If the model is true, the $z_i$'s should be approximately independent, each with standard deviation 1. If the data are overdispersed, the $z_i$'s will be larger in absolute value because of the extra variation beyond what the Poisson model predicts. So the standard errors are in fact calculated using "information contained in the distribution of the observed residuals", but that distribution is wrong because your model is misspecified: you assume equidispersion and the assumption does not hold.
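A minimal sketch of that check (same kind of simulated overdispersed data as above; all names are illustrative): compute the Pearson residuals from a Poisson fit and look at their spread.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Overdispersed simulated counts (illustrative only), fit with a Poisson GLM.
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=1000)
mu = np.exp(0.2 + 0.7 * x)
y = rng.negative_binomial(2, 2 / (2 + mu))
fit = smf.glm("y ~ x", data=pd.DataFrame({"x": x, "y": y}),
              family=sm.families.Poisson()).fit()

z = fit.resid_pearson                      # (y_i - yhat_i) / sqrt(yhat_i)
dispersion = np.sum(z**2) / fit.df_resid

# Under a correctly specified Poisson model this ratio is close to 1;
# for these data it comes out well above 1, signalling overdispersion.
print(dispersion)
```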
D) "My best guess is that for every fitted values, there might only be a few observed values, and there is not enough information contained in these few observed values to calculate the standard errors." What is wrong in your figure is that there are many residuals greater than 1 in absolute value, so they cannot be standardized residuals. More observed values may help, but a better specified model would be more effective. Look at Zeileis, Kleiber, and Jackman. They use a dataset with 4406 observations. Look at the fitted vs Person residuals plots for a Poisson and a negative binomial model:
Which one do you prefer?
However, while the negative binomial already improves the fit dramatically, it can in turn be improved by the hurdle and zero-inflated models. See Zeileis, Kleiber, and Jackman.
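To mimic that kind of comparison without their data (a sketch on simulated counts; the negative binomial dispersion parameter is fixed here for simplicity rather than estimated, as a fuller analysis would do):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Overdispersed simulated counts (illustrative only), then side-by-side
# fitted-versus-Pearson-residual plots under Poisson and negative binomial fits.
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=1000)
mu = np.exp(0.2 + 0.7 * x)
y = rng.negative_binomial(2, 2 / (2 + mu))   # variance = mu + 0.5 * mu**2
df = pd.DataFrame({"x": x, "y": y})

pois = smf.glm("y ~ x", data=df, family=sm.families.Poisson()).fit()
# alpha is fixed at the value used to simulate; in practice it is estimated
# (e.g. with statsmodels' discrete NegativeBinomial model).
negb = smf.glm("y ~ x", data=df,
               family=sm.families.NegativeBinomial(alpha=0.5)).fit()

fig, axes = plt.subplots(1, 2, sharey=True, figsize=(8, 3))
for ax, fit, title in zip(axes, (pois, negb), ("Poisson", "Negative binomial")):
    ax.scatter(fit.fittedvalues, fit.resid_pearson, s=5)
    ax.axhline(0, color="grey")
    ax.set(title=title, xlabel="fitted values", ylabel="Pearson residuals")
plt.show()
```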
Best Answer
Conditional probability models like the LPM, the probit, and the logit do not have error terms. At that level, the functional specification of the conditional probability is entirely disconnected from probabilistic arguments; it is just a mathematical functional form that happens to have convenient and/or realistic properties.
To be able to "see" the error term, namely the random element, and discuss probabilistic assumptions and distributions, one must apply the "latent variable" approach (common at least in econometrics), through which these conditional probability models are induced by fundamental distributional hypotheses at the initial level.
In this approach the Linear Probability Model is the result of assuming that the error term in the underlying latent-variable regression follows a uniform distribution that is symmetric around zero.
Assuming a simple regression setting for simplicity, we initially specify that
$$Y^* = b_0+ b_1X + \epsilon,\;\; \epsilon\mid X\sim U(-a,a)$$
The error term has a zero expected value, conditional on the regressors. The cumulative distribution function here is $F_{\epsilon|X}(\epsilon\mid X) = \frac {\epsilon + a}{2a}$, for $-a\leq \epsilon\leq a$.
$Y^*$ is unobservable (or it may be observable in principle, but we do not have data on it). But we do have data on the indicator function $Y = I\{Y^*\geq 0\}$.
The observed model is then
$$P(Y =1\mid X ) = P(Y^*>0\mid X) = P(b_0+ b_1X + \epsilon>0\mid X) = P(\epsilon >- b_0- b_1X\mid X)$$ $$=1-F_{\epsilon|X}(- b_0- b_1X\mid X) = 1-\frac {- b_0- b_1X + a}{2a} = \frac {a+b_0}{2a}+\frac {b_1}{2a}X$$
$$\Rightarrow P(Y =1\mid X )= \beta_0 + \beta_1X$$
which is the Linear Probability model with the mapping
$$\beta_0 = \frac {a+b_0}{2a},\;\; \beta_1=\frac{b_1}{2a}$$
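A quick simulation sketch of this mapping (Python; the values of $a$, $b_0$, $b_1$ are arbitrary choices that keep the implied probabilities inside $(0,1)$): generate the latent model, observe only the indicator, run OLS on it, and compare with the implied LPM coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
n, a, b0, b1 = 100_000, 2.0, 0.3, 0.5     # arbitrary illustrative values

x = rng.uniform(-1, 1, size=n)
eps = rng.uniform(-a, a, size=n)          # U(-a, a) latent error
y_star = b0 + b1 * x + eps                # latent variable
y = (y_star >= 0).astype(float)           # observed indicator

X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

print(beta_hat)                           # OLS on the observed indicator
print((a + b0) / (2 * a), b1 / (2 * a))   # implied LPM coefficients
```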
So it is not about "correcting" anything. If indeed the underlying data generating mechanism is as assumed above, then the LPM is the correct model specification, and the probit or the logit would be a misspecification.
So the probit model, too, does not "correct" for the things mentioned in the question; we assume that the underlying error distribution is normal to begin with. Likewise with the logit model and the logistic distribution.
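For comparison, the analogous latent-variable step under a normal error assumption (a sketch carrying over the notation from above):
$$P(Y =1\mid X) = P(\epsilon > -b_0 - b_1X \mid X) = 1-\Phi\!\left(\frac{-b_0-b_1X}{\sigma}\right) = \Phi\!\left(\frac{b_0+b_1X}{\sigma}\right),\qquad \epsilon\mid X\sim N(0,\sigma^2)$$
and replacing $\Phi$ with the logistic cdf gives the logit. In each case the functional form follows from the assumed error distribution, not from any "correction" of the LPM.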