Solved – Is E(u|x)=0 a required condition for estimator consistency?

consistency, expected value, self-study

Is it true that, for OLS parameter estimates to be consistent, it must be the case that E(u|x)=0?
E(u|x)=0 is a required condition for unbiasedness. But as far as I understand, unbiasedness does not necessarily mean consistency. Therefore I am really confused.

Best Answer

Ok. The model, in matrix notation with conformable dimensions, is $$\mathbf y = \mathbf X\beta + \mathbf u $$

The OLS estimator is

$$\hat \beta = (\mathbf X'\mathbf X)^{-1}\mathbf X' \mathbf y = (\mathbf X'\mathbf X)^{-1}\mathbf X' (\mathbf X\beta + \mathbf u) $$

$$= (\mathbf X'\mathbf X)^{-1}\mathbf X' \mathbf X\beta + (\mathbf X'\mathbf X)^{-1}\mathbf X'\mathbf u = \beta + (\mathbf X'\mathbf X)^{-1}\mathbf X'\mathbf u$$

For consistency we examine

$$\operatorname{plim}\hat \beta = \operatorname{plim}\beta + \operatorname{plim}\left[(\mathbf X'\mathbf X)^{-1}\mathbf X'\mathbf u\right] = \beta + \operatorname{plim}\left[\left(\frac 1n\mathbf X'\mathbf X\right)^{-1}\left(\frac 1n\mathbf X'\mathbf u\right)\right] $$

And here is the crucial point that lets us get by with a weaker assumption for consistency than for unbiasedness: for unbiasedness we would face $E\left[(\mathbf X'\mathbf X)^{-1}\mathbf X'\mathbf u\right]$, and in order to "insert" the expected value into the expression we have to condition on $\mathbf X$. This leads us to the expression $E(\mathbf u\mid \mathbf X)$ and the need to assume that it equals zero, i.e. to assume "mean-independence" between the error term and the regressors.

But $\operatorname{plim}$ is a more "flexible" operator than $E$. Under $\operatorname{plim}$, products can be decomposed into products of the individual probability limits, provided each of them exists (something that under the expected value requires independence). Also, $\operatorname{plim}$ can "go inside the expression" (while $E$ cannot, unless the function is affine), as long as the function is a continuous transformation (and it very rarely isn't), so

$$\operatorname{plim}\left[\left(\frac 1n\mathbf X'\mathbf X\right)^{-1}\left(\frac 1n\mathbf X'\mathbf u\right)\right] = \operatorname{plim}\left(\frac 1n\mathbf X'\mathbf X\right)^{-1}\operatorname{plim}\left(\frac 1n\mathbf X'\mathbf u\right)$$
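The contrast between the two operators can be checked numerically. The sketch below is only an illustration and not part of the derivation: it assumes i.i.d. Exp(1) draws (population mean 1) and the continuous function $g(x)=1/x$, and the seed and sample sizes are arbitrary choices. For this particular setup the exact value of $E(1/\bar x_n)$ is $n/(n-1)$, so with $n=5$ the average of $1/\bar x_n$ over many replications should stay well away from $1/E(\bar x_n)=1$, while a single realisation with large $n$ should already be close to $1/\operatorname{plim}\bar x_n = 1$.

```python
import random


def inverse_of_sample_mean(n, rng):
    """Return 1 / (sample mean) of n i.i.d. Exp(1) draws (population mean 1)."""
    total = sum(rng.expovariate(1.0) for _ in range(n))
    return n / total


rng = random.Random(0)  # fixed seed, arbitrary choice for reproducibility

# Expectation does not pass through g(x) = 1/x: average 1/x_bar over many
# small samples (n = 5) and compare it with 1 / E(x_bar) = 1.
reps = 20000
small_n_average = sum(inverse_of_sample_mean(5, rng) for _ in range(reps)) / reps

# plim does pass through g: one large sample is enough to see 1/x_bar
# settle near 1 / plim(x_bar) = 1.
large_n_value = inverse_of_sample_mean(200000, rng)

print(f"average of 1/x_bar over {reps} samples of size 5: {small_n_average:.3f}")
print(f"single value of 1/x_bar with n = 200000: {large_n_value:.3f}")
```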

For consistency we need to assume that the first $\operatorname{plim}$ in this product, $\operatorname{plim}\left(\frac 1n\mathbf X'\mathbf X\right)^{-1}$, is finite, but this is an assumption on the properties of the regressor matrix, unrelated to the error term. So we are left with the second $\operatorname{plim}$ which, written out using sums for clarity, is

$$\operatorname{plim}\left(\frac 1n\mathbf X'\mathbf u\right) = \left[\begin{matrix} \operatorname{plim}\frac 1n\sum_{i=1}^nx_{1i}u_i \\ \vdots \\ \operatorname{plim}\frac 1n\sum_{i=1}^nx_{ki}u_i \\ \end{matrix}\right] \rightarrow\left[\begin{matrix} \frac 1n\sum_{i=1}^nE(x_{1i}u_i) \\ \vdots \\ \frac 1n\sum_{i=1}^nE(x_{ki}u_i) \\ \end{matrix}\right] $$

where the last step follows from the usual assumptions that permit the application of the law of large numbers.

Exactly because we have been able to "separate" $(\mathbf X'\mathbf X)^{-1}$ from $\mathbf X'\mathbf u$ (due to the fact that we are examining the $\operatorname{plim}$ and not $E$), we ended up looking only at the contemporaneous relation between each regressor and the error term. And so what we need to assume for consistency of the OLS estimator is only that $E(x_{ki}u_i) =0 \; \forall k, \; \forall i$ (contemporaneous uncorrelatedness). This is much weaker than $E(\mathbf u\mid \mathbf X)=0$, which requires mean-independence, and requires it not only contemporaneously but across all observations too (since we condition the whole error vector on the whole regressor matrix).
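A small simulation can illustrate that uncorrelatedness alone is enough for the estimates to converge. The sketch below is not part of the original argument and rests on several assumptions made purely for illustration: a single regressor plus an intercept, hypothetical true coefficients `BETA0 = 1` and `BETA1 = 2`, a regressor $x$ drawn as a centered Exp(1) variable (so $E(x)=0$, $E(x^2)=1$, $E(x^3)=2$), and an error constructed as $u = x^2 - 2x - 1$. By construction $E(u)=0$ and $E(xu)=0$, while $E(u\mid x) = x^2 - 2x - 1 \neq 0$, so mean-independence fails.

```python
import random

BETA0, BETA1 = 1.0, 2.0  # hypothetical true coefficients, chosen for illustration


def simulate_ols(n, rng):
    """Draw n observations and return the OLS (intercept, slope) estimates."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.expovariate(1.0) - 1.0  # centered Exp(1): E(x)=0, E(x^2)=1, E(x^3)=2
        u = x * x - 2.0 * x - 1.0       # E(u)=0 and E(xu)=0, but E(u|x) = u != 0
        xs.append(x)
        ys.append(BETA0 + BETA1 * x + u)

    # Closed-form OLS for one regressor with an intercept.
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


rng = random.Random(12345)  # fixed seed, arbitrary choice for reproducibility
for n in (100, 10000, 200000):
    b0, b1 = simulate_ols(n, rng)
    print(f"n = {n:>6}: intercept = {b0:.3f}, slope = {b1:.3f}")
```

As $n$ grows, the printed estimates should settle near the assumed true values, even though the error is an exact function of the regressor and $E(u\mid x)=0$ does not hold.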