OLS Regression – Understanding Strict Exogeneity Condition

econometrics, exogeneity, least squares, regression

In Hayashi's Econometrics, it is stated that one of the assumptions of classical OLS is: $$\mathbb{E}(\epsilon_i \mid \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n) = 0 \text{, for } i=1, \ldots, n. \tag{1}$$ I know that this implies $\mathbb{E}(\epsilon_i) = 0$ for all $i = 1, \ldots, n$, and that the error term is uncorrelated with the regressors.

But what does equation (1) itself actually mean? A pedagogical example would be helpful.

Best Answer

In plain English, it means that conditional on the regressors of every observation in the sample (not only observation $i$'s own), the expected value of the error term $\epsilon_i$ is zero.

How might this be violated?

Example: omitted variable correlated with $x$

Imagine the true model is: $$ y_i = \alpha + \beta x_i + \gamma z_i + u_i$$

But instead imagine we're running the regression: $$ y_i = \alpha + \beta x_i + \underbrace{\epsilon_i}_{\gamma z_i + u_i}$$

Then: $$\begin{align*} E[\epsilon_i \mid x_i ] &= E[\gamma z_i + u_i \mid x_i] \\ &=\gamma E[ z_i\mid x_i] \quad \text{ assuming $u_i$ is white noise, so that $E[u_i \mid x_i] = 0$} \end{align*}$$

If $E[z_i \mid x_i] \neq 0$ and $\gamma \neq 0$, then $E[\epsilon_i \mid x_i] \neq 0$ and strict exogeneity is violated.

For example, imagine $y$ is wages, $x$ is an indicator for a college degree, and $z$ is some measure of ability. Suppose wages are a function of both education and ability (the true data generating process is the first equation), and college graduates are expected to have higher ability ($E[z_i \mid x_i] \neq 0$) because college tends to attract and admit higher-ability students. Then a simple regression of wages on education violates the strict exogeneity assumption. We have a classic confounding variable: ability causes education, and ability affects wages, so the expectation of the error in the second equation (the regression of wages on education alone), given education, isn't zero.

What would happen if we did run the regression? We would pick up both the education effect and the ability effect in the education coefficient. In this simple linear example, the estimated coefficient $b$ on $x$ would pick up the effect of $x$ on $y$ ($\beta$) plus the association between $x$ and $z$ times the effect of $z$ on $y$ ($\gamma$).
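To make this concrete, here is a minimal simulation sketch of the wages example. Everything in it is an illustrative assumption rather than part of the original argument: the parameter values ($\alpha = 1$, $\beta = 2$, $\gamma = 3$), the normal distributions, the rule that generates the college indicator from ability, and the function name `simulate_ovb`. It compares the short regression (wages on education only) with the long regression (which also includes ability), and checks the decomposition described above.

```python
import numpy as np


def simulate_ovb(n=200_000, alpha=1.0, beta=2.0, gamma=3.0, seed=0):
    """Simulate y = alpha + beta*x + gamma*z + u with z correlated with x.

    All parameter values and distributions are made up for illustration.
    """
    rng = np.random.default_rng(seed)

    z = rng.normal(size=n)                       # "ability" (unobserved in the short regression)
    # College indicator: more likely to be 1 when ability is high,
    # so E[z | x] depends on x by construction.
    x = (z + rng.normal(size=n) > 0).astype(float)
    u = rng.normal(size=n)                       # noise, independent of x and z
    y = alpha + beta * x + gamma * z + u

    ones = np.ones(n)
    X_short = np.column_stack([ones, x])         # regress y on x only
    X_long = np.column_stack([ones, x, z])       # regress y on x and z

    b_short = np.linalg.lstsq(X_short, y, rcond=None)[0]
    b_long = np.linalg.lstsq(X_long, y, rcond=None)[0]

    # Auxiliary regression of z on x: its slope measures the
    # association between the omitted variable and the regressor.
    delta = np.linalg.lstsq(X_short, z, rcond=None)[0][1]

    # Error term of the short regression, eps = gamma*z + u.
    eps = gamma * z + u

    return {
        "b_short": b_short[1],
        "b_long_x": b_long[1],
        "b_long_z": b_long[2],
        "delta": delta,
        "mean_eps_college": eps[x == 1].mean(),
        "mean_eps_no_college": eps[x == 0].mean(),
    }


result = simulate_ovb()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, the average of $\epsilon_i$ differs sharply between the two education groups, which is exactly the failure of $E[\epsilon_i \mid x_i] = 0$. The short-regression slope sits well above $\beta$, while the long-regression slope is close to it. The short slope also equals the long slope on $x$ plus the long coefficient on $z$ times the slope from regressing $z$ on $x$; this holds exactly in-sample as a matter of OLS algebra.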
