Endogeneity – Does an Endogenous Variable Bias the Coefficient of the Exogenous One?

Tags: bias, causality, endogeneity, least squares, unbiased-estimator

We have the following model:

$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon. $$

We know that:

\begin{align*}
\operatorname{Cov}(x_1, \epsilon) &\neq 0 \\
\operatorname{Cov}(x_2, \epsilon) &= 0.
\end{align*}

We estimate this model using OLS, obtaining the coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\beta}_2$.

My question is: is $\mathbb{E}(\hat{\beta}_2) = \beta_2$?

Best Answer

Except in the jointly normal case, zero covariance does not imply independence, and you have not specified any distributions, so we cannot assume joint normality. Technically, then, we cannot conclude from the stated assumptions that $\hat\beta_2$ is unbiased. If you can show that $x_2$ and $\epsilon$ are independent, you are most of the way there. From the causal-diagram perspective, you would model that independence as the absence of any backdoor path from $x_2$ to $y$, in which case the model yields an unbiased $\hat\beta_2$. One caveat: the regression also conditions on $x_1$, so this further requires that $x_2$ be uncorrelated with $x_1$; otherwise the endogeneity of $x_1$ can carry over into $\hat\beta_2$.

Incidentally, a nonzero covariance does not necessarily mean that you have a problem with $x_1$ either. If $x_1$ causally influences $\epsilon$, then $\epsilon$ is merely a mediator and there is no confounding. If instead $\epsilon$ influences $x_1$, you have the backdoor path $x_1\leftarrow\epsilon\rightarrow y$, and identifying the causal effect of $x_1$ on $y$ becomes more difficult, though I realize you are actually asking about $x_2$.
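To get a feel for when $\hat\beta_2$ is affected, a small simulation can help. The sketch below is only an illustration under assumptions that are not in the question: all variables are Gaussian, $\epsilon$ influences $x_1$ directly, and a hypothetical parameter `a` controls how strongly $x_1$ depends on $x_2$. The function name `simulate_beta2_error` and the coefficient values are made up for the example. It reports the estimation error in one large sample, which approximates the large-sample bias rather than the exact finite-sample expectation.

```python
import numpy as np


def simulate_beta2_error(a, n=200_000, seed=0):
    """Return beta2_hat - beta2 for one large simulated sample.

    Assumed data-generating process (illustrative only):
        x2, eps, u ~ independent N(0, 1)
        x1 = a * x2 + eps + u        -> Cov(x1, eps) = 1 (endogenous)
        y  = b0 + b1 * x1 + b2 * x2 + eps
    so Cov(x2, eps) = 0 and Cov(x1, x2) = a.
    """
    rng = np.random.default_rng(seed)
    beta0, beta1, beta2 = 1.0, 2.0, 3.0  # arbitrary true coefficients

    x2 = rng.normal(size=n)
    eps = rng.normal(size=n)
    u = rng.normal(size=n)
    x1 = a * x2 + eps + u  # eps feeds into x1, making x1 endogenous
    y = beta0 + beta1 * x1 + beta2 * x2 + eps

    # OLS with an intercept via least squares
    X = np.column_stack([np.ones(n), x1, x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[2] - beta2


if __name__ == "__main__":
    for a in (0.0, 1.0):
        print(f"a = {a}: beta2_hat - beta2 = {simulate_beta2_error(a):+.3f}")
```

Under this particular design, the large-sample error in $\hat\beta_2$ works out to $-a/2$: it is close to zero when $x_1$ and $x_2$ are uncorrelated (`a = 0`) and close to $-0.5$ when `a = 1`, even though $\operatorname{Cov}(x_2, \epsilon) = 0$ in both cases.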