Solved – zero conditional mean assumption coupled with random sampling assumption (deriving unbiasedness)

least squares, unbiased-estimator

I don't get part of the explanation of deriving unbiasedness of OLS in my textbook.

I understand that to derive unbiasedness we have to use conditional expectations (conditioning on $x$), so that the error term has conditional mean zero, $E(u|x)=0$, and we can prove unbiasedness of OLS.

The author of my textbook writes:

In addition to restricting the relationship between u and x in the population, the zero conditional mean assumption – coupled with the random sampling assumption – allows for technical simplification. We can derive the statistical properties of the OLS estimators as conditional on the values of $x_i$ in our sample. Technically, in statistical derivations, conditioning on the sample values of the independent variable is the same as treating the $x_i$ as fixed in repeated samples.

My question is now:
Why is conditioning on the sample values of $x_i$ the same as treating the $x_i$ as fixed in repeated samples?
In my opinion unbiasedness of OLS can be derived just by using the conditional expectation properties.
I hope someone can explain to me intuitively how this works with the zero conditional mean assumption coupled with the random sampling assumption. And why exactly does the random sampling assumption make such a difference/simplification?

Best Answer

This is a tricky point in most econometrics books. To show that the estimators $\hat\beta$ are unbiased, you need the zero conditional mean assumption in the form $E[u|X]=0$. The catch is that this version of the assumption refers to the expectation of $u$ given all observations in the sample (all the $x$'s). When authors introduce regression models in their books, they implicitly state the zero conditional mean assumption only with respect to the $x$ that belongs to the same observation as $u$. Under random sampling the observations are independent of each other, so the two versions coincide.
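To make the step explicit, here is a sketch of the argument for the simple regression $y_i = \beta_0 + \beta_1 x_i + u_i$. The notation (the weights $w_i$ in particular) is my own and not taken from the textbook, and I assume the $x_i$ are not all equal in the sample. Random sampling makes $(x_i, u_i)$ independent across $i$, so the other observations' $x$'s carry no information about $u_i$:

$$
E(u_i \mid x_1,\dots,x_n) = E(u_i \mid x_i) = 0 .
$$

The OLS slope can be written as the true parameter plus a weighted sum of the errors, where each weight depends on all the sample values of $x$:

$$
\hat\beta_1 = \beta_1 + \sum_{i=1}^n w_i u_i, \qquad w_i = \frac{x_i - \bar x}{\sum_{j=1}^n (x_j - \bar x)^2} .
$$

Conditional on $x_1,\dots,x_n$ the weights are constants, which is exactly what "treating the $x_i$ as fixed in repeated samples" means, so

$$
E(\hat\beta_1 \mid x_1,\dots,x_n) = \beta_1 + \sum_{i=1}^n w_i \, E(u_i \mid x_1,\dots,x_n) = \beta_1 ,
$$

and by the law of iterated expectations $E(\hat\beta_1) = \beta_1$. Because $w_i$ involves every $x_j$, conditioning on $x_i$ alone would not be enough to pull the weights out of the expectation; that is where random sampling does its work.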

If you jump to the time series chapter of your textbook, you will notice this distinction: there the author explicitly states that the zero conditional mean assumption refers to the entire sample of $X$ and not only to the contemporaneous $x$. This makes sense in time series analysis, where random sampling cannot be assumed.
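A small simulation may help to see why the distinction matters. This is only an illustrative sketch under assumptions of my own (normal errors, $n=20$, true slope $0.5$, and an AR(1) model as the time series example); none of it comes from the textbook. In the cross-section case the data are a random sample, so the OLS slope is unbiased. In the AR(1) case the regressor $y_{t-1}$ is uncorrelated with the current error but not with past errors, so the assumption fails when you condition on all the $x$'s, and OLS is biased in small samples.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, beta = 20, 5000, 0.5  # illustrative values, chosen arbitrarily


def ols_slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    xc = x - x.mean()
    return (xc @ y) / (xc @ xc)


# Case 1: random sampling, E(u_i | x_1, ..., x_n) = 0 holds.
cross = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    cross[r] = ols_slope(x, 1.0 + beta * x + u)

# Case 2: AR(1), y_t = beta * y_{t-1} + u_t.
# E(u_t | y_{t-1}) = 0 holds, but u_t affects later regressors y_t, y_{t+1}, ...
ar = np.empty(reps)
for r in range(reps):
    u = rng.normal(size=n + 1)
    y = np.empty(n + 1)
    y[0] = u[0]
    for t in range(1, n + 1):
        y[t] = beta * y[t - 1] + u[t]
    ar[r] = ols_slope(y[:-1], y[1:])

mean_cross = cross.mean()
mean_ar = ar.mean()
print(f"true slope:                    {beta}")
print(f"mean OLS slope, random sample: {mean_cross:.3f}")
print(f"mean OLS slope, AR(1):         {mean_ar:.3f}")
```

Averaged over many replications, the random-sample estimate should sit close to the true slope, while the AR(1) estimate should come out noticeably below it.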
