Solved – OLS – difference between exogeneity and homoscedasticity

econometrics, endogeneity, exogeneity, heteroscedasticity, least squares

I was wondering what the difference is between the concepts of 'homoscedasticity/heteroscedasticity' and 'exogeneity/endogeneity' when it comes to Ordinary Least Squares estimation.

In my view, they are both defined by the existence of correlation between the x-variable and the residuals. This is also more or less confirmed by, for example, Wikipedia (I can't post more than two links, but if you google endogeneity (econometrics) and heteroscedasticity you can verify this).

The book that I am using is this one: http://www.listinet.com/bibliografia-comuna/Cdu339-A719.pdf

They explain homoscedasticity by stating that the variances of the n disturbances exist and are all equal (and thus cannot be influenced by the x-variables?) – see page 93 of the book:

$$\operatorname{E}(\epsilon^2_i)=\sigma^2$$

They explain exogeneity by stating that the probability limit of $(1/n)X'\epsilon$ should converge to 0, where $X$ is the matrix of observations of the explanatory variables and $\epsilon$ the vector of disturbances. Furthermore, they state that the condition basically means that the explanatory variables should be asymptotically uncorrelated with the disturbances. – see page 194 of the book

$$\operatorname{plim}(\frac{1}{n}X'\epsilon)=0.$$
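For intuition, this condition can also be checked numerically. Here is a minimal NumPy sketch (my own toy example, not from the book: one regressor and disturbances drawn independently of it, so the orthogonality condition holds by construction):

```python
import numpy as np

rng = np.random.default_rng(0)

# With x_i independent of eps_i, the sample average
# (1/n) * sum(x_i * eps_i) shrinks toward 0 as n grows,
# illustrating plim (1/n) X'eps = 0.
for n in (100, 10_000, 1_000_000):
    x = rng.normal(1.0, 1.0, size=n)    # arbitrary illustrative regressor
    eps = rng.normal(0.0, 1.0, size=n)  # disturbances, independent of x
    print(f"n = {n:>9}: (1/n) X'eps = {(x @ eps) / n:+.5f}")
```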

I hope someone can explain the difference! Thanks a lot in advance!

Best Answer

Random variables $\{u_i; i=1,...,n\}$ are said to be "homoskedastic" when

$$\text{Var}(u_i) = \text{constant},\;\; \forall i$$

This property can coexist with conditional heteroskedasticity:

$$\text{Var}(u_i \mid \mathbf x_i) = h(\mathbf x_i)$$

This is because, by the Law of Total Variance, we have

$$\text{Var}(u_i) = E\big[\text{Var}(u_i \mid \mathbf x_i)\big] + \text{Var}\big[E(u_i\mid \mathbf x_i)\big]$$

$$= E[h(\mathbf x_i)]+\text{Var}\big[E(u_i\mid \mathbf x_i)\big]$$

The second term is a moment of the distribution of the random variable $E(u_i\mid \mathbf x_i)$, and so a constant (irrespective of whether $E(u_i\mid \mathbf x_i)=0$ or not). The first term will be constant over $i$ if the $\mathbf x_i$'s have equal mean over $i$.
In other words, if the corresponding collection of regressors is "first-order stationary", then we can have conditional heteroskedasticity and unconditional homoskedasticity at the same time.
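To see this concretely, here is a small NumPy simulation (a sketch under illustrative assumptions of my own: $h(\mathbf x_i)=x_i^2$ and i.i.d. $x_i \sim N(2,1)$, so the regressors trivially have equal moments over $i$):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Regressors are i.i.d.: x_i ~ N(2, 1), so E[x_i^2] = 4 + 1 = 5 for all i
x = rng.normal(2.0, 1.0, size=n)

# Conditional heteroskedasticity: Var(u_i | x_i) = h(x_i) = x_i^2
u = rng.normal(0.0, np.abs(x), size=n)

# Unconditional variance: by the law of total variance,
# Var(u_i) = E[h(x_i)] + Var[E(u_i|x_i)] = E[x_i^2] + 0 = 5, the same for every i
print("unconditional Var(u):", u.var())  # close to 5

# Yet the conditional variance clearly depends on x: compare two bins of x
lo, hi = x < 1.5, x > 2.5
print("Var(u | x < 1.5):", u[lo].var())  # much smaller
print("Var(u | x > 2.5):", u[hi].var())  # much larger
```

So one and the same sample exhibits conditional heteroskedasticity and unconditional homoskedasticity, exactly as the decomposition above predicts.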

As regards the relation/contrast between "homoskedasticity" and "exogeneity": first of all, in fairness to the textbook you mention, they actually write on page 194

"This last condition is called the orthogonality condition. If this condition is satisfied, then the explanatory variables are said to be exogenous (or sometimes ‘weakly’ exogenous, to distinguish this type of exogeneity, which is related to consistent estimation, from other types of exogeneity related to forecasting and structural breaks)."

So they do point out that the more accurate term for the property is the "orthogonality" one, while the concept of "exogeneity" has variations in "weak", "strong" or "strict" each reflecting a different assumption.

Now, as regards the essence of the question: conditional homoskedasticity states that

$$\text{Var}(u_i \mid \mathbf x_i) = E(u_i^2\mid \mathbf x_i) - \left[E(u_i\mid \mathbf x_i)\right]^2 = \text{constant}$$

So it is a statement about whether moments of the distribution followed by the $u_i$'s are affected by the presence of $\mathbf x_i$ (or, in an informal, informational approach: if we know $\mathbf x_i$, does the variation that we anticipate to see in $u_i$, as summarized by the variance, change compared to when we don't know $\mathbf x_i$?). Keep in mind that this "variation" relates to second moments.

On the other hand, the orthogonality property states that $$E(\mathbf x_i\cdot u_i)=0$$

This is a statement about the first moment of a specific function of $\mathbf x_i$ and $u_i$, namely, their product. In a regression setting, where we assume that $E(u_i)=0$, we have that

$$E(\mathbf x_i\cdot u_i)=0 \implies \text{Cov}(\mathbf x_i, u_i)=0$$

So this is a property about whether $\mathbf x_i$ and $u_i$ tend to co-vary.
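A short simulation may help here as well (again a sketch; the data-generating process, with true slope 2 and a disturbance $u = 0.5\,x + e$ in the endogenous case, is invented for illustration). When orthogonality fails, the OLS slope no longer recovers the true coefficient:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Exogenous case: Cov(x, u) = 0, so the OLS slope is consistent for beta = 2
x = rng.normal(0.0, 1.0, size=n)
u = rng.normal(0.0, 1.0, size=n)
y = 1.0 + 2.0 * x + u
slope_exo = np.cov(x, y, bias=True)[0, 1] / x.var()

# Endogenous case: build a disturbance that co-varies with x,
# u_end = 0.5 * x + e, so Cov(x, u_end) = 0.5 and E(x * u_end) != 0
e = rng.normal(0.0, 1.0, size=n)
u_end = 0.5 * x + e
y_end = 1.0 + 2.0 * x + u_end
slope_end = np.cov(x, y_end, bias=True)[0, 1] / x.var()

print("slope, orthogonality holds:", slope_exo)  # close to 2
print("slope, orthogonality fails:", slope_end)  # close to 2.5, shifted by Cov(x,u)/Var(x)
```

Note that the endogenous case above can still be conditionally homoskedastic ($\text{Var}(u \mid x) = \text{Var}(e) = 1$ everywhere), which underlines that the two properties are logically separate.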

So, in general informal terms, conditional homoskedasticity and orthogonality both state that "the $\mathbf x_i$'s do not tell us something about the $u_i$'s" - but this "something" is a different "something" in each case, and the two are usefully and meaningfully distinguished.
