OLS Regression – Understanding and Interpreting Consistency

consistency, regression, self-study

Many econometrics textbooks (e.g. Wooldridge, "Econometric Analysis…") simply write something like: "If the population model is $y = x\beta + u$ and (1) $\text{Cov}(x,u) = 0$; (2) $X'X$ is full rank, then OLS consistently estimates the parameters $\beta$." I've worked out the math behind consistency, but I'm a bit lost on interpreting its meaning.

We're talking about consistent estimation, but estimation of what? Please tell me if the following points are correct:

  1. If we have a random sample $(X,Y)$ and $X'X$ is invertible, then we can always define the Best Linear Predictor (BLP) of $y$ given $x$, and OLS always consistently estimates the coefficients of that BLP (because in the BLP we have $\text{Cov}(x,u)=0$ by definition). Bottom line: we can always interpret OLS estimates as the coefficients of the BLP (see the simulation sketch at the end of this question).

  2. The only question is whether the BLP coincides with the conditional expectation $\text{E}(y|x)$. If it does (for which we need $\text{E}(u|x) = 0$), then we can interpret the OLS estimates as partial effects.

What is wrong with this, or what am I missing? I don't see the point of stating the assumption $\text{Cov}(x,u) = 0$ if it is fulfilled by definition in the case of the Best Linear Predictor. And if we want to give the BLP coefficients a structural interpretation (partial effects), we need the stronger assumption $\text{E}(u|x) = 0$ anyway. Is there any other interpretation available under zero covariance alone?
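
To make point 1 concrete, here is a small simulation sketch (plain numpy; the quadratic data-generating process is just an illustrative choice, not from any textbook): the conditional mean $\text{E}(y|x) = x^2$ is nonlinear, yet OLS still converges to the analytically computable BLP coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_fit(x, y):
    """Return (intercept, slope) from an OLS fit of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

for n in [100, 10_000, 1_000_000]:
    x = rng.uniform(0.0, 1.0, size=n)
    y = x**2 + rng.normal(scale=0.1, size=n)   # E(y|x) = x^2: nonlinear in x
    b0, b1 = ols_fit(x, y)
    print(f"n={n:>9}: intercept={b0: .4f}, slope={b1: .4f}")

# For x ~ Uniform(0,1): BLP slope = Cov(x, x^2)/Var(x) = (1/12)/(1/12) = 1 and
# BLP intercept = E(y) - 1*E(x) = 1/3 - 1/2 = -1/6, so OLS heads to (-1/6, 1)
# even though the partial effect dE(y|x)/dx = 2x is not constant.
```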

Best Answer

An estimator $\hat{\beta}$ of $\beta$ is consistent if $\hat{\beta} \rightarrow_{p} \beta$, or equivalently,

$\lim_{n \rightarrow \infty} \mbox{Pr}(|\hat{\beta} - \beta| < \epsilon) = 1$ for all positive real $\epsilon$.
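
As a rough illustration of that definition (a minimal Monte Carlo sketch with a made-up slope, not part of the original argument), the estimated probability that $|\hat{\beta} - \beta| < \epsilon$ climbs toward 1 as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(42)
beta, eps, reps = 2.0, 0.05, 2_000

for n in [50, 500, 5_000]:
    hits = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        u = rng.normal(size=n)                      # Cov(x, u) = 0 by construction
        y = beta * x + u
        beta_hat = np.sum(x * y) / np.sum(x * x)    # OLS slope (no intercept)
        hits += abs(beta_hat - beta) < eps
    print(f"n={n:>5}: Pr(|beta_hat - beta| < {eps}) ~ {hits / reps:.3f}")
```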

Consistency in the literal sense means that sampling more and more of the world eventually gets us what we want. Note that it is distinct from other optimality criteria: there are minimum variance estimators that are inconsistent (I'm failing to find the famous example on Google at this point).

Unbiased minimum variance is a good starting place for thinking about estimators, but it helps to remember that we may have other criteria for a "best" estimator. There is the general class of minimax estimators, and there are estimators that minimize MSE rather than variance (accepting a little bias in exchange for a whole lot less variance can be good). Such estimators can still be consistent because they converge asymptotically to the population values.

The interpretation of the slope parameter comes from the context of the data you've collected. For instance, if $Y$ is fasting blood glucose and $X$ is the previous week's caloric intake, then $\beta$ in the linear model $E[Y|X] = \alpha + \beta X$ is interpreted as the associated difference in fasting blood glucose comparing individuals who differ by 1 kCal in weekly intake (it may make sense to standardize $X$ by dividing by $2{,}000$).

That is what you consistently estimate with OLS as $n$ increases.
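
As a hedged illustration of the rescaling remark (entirely simulated data with made-up coefficients), dividing $X$ by $2{,}000$ simply multiplies the fitted slope by $2{,}000$, so $\beta$ is then read as the glucose difference per 2,000 extra weekly kCal:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
kcal = rng.normal(14_000, 2_000, size=n)                     # weekly caloric intake
glucose = 70 + 0.001 * kcal + rng.normal(scale=5, size=n)    # made-up linear relationship

def slope(x, y):
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

print("beta per 1 kCal:    ", slope(kcal, glucose))          # ~0.001
print("beta per 2,000 kCal:", slope(kcal / 2_000, glucose))  # ~2, i.e. 2,000x larger
```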

Regarding #2: linear regression is a projection. Projecting the observed responses onto the space spanned by the predictors necessarily generates its additive orthogonal error component, the residuals. With an intercept in the model, these residuals always have mean 0 in the sample and are orthogonal to the fitted values (their dot product is always exactly zero). This holds regardless of homoscedasticity, normality, linearity, or any of the other classical assumptions of regression models.
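
A quick numerical check of that projection property (my own sketch, with deliberately heteroskedastic, skewed errors and a misspecified linear mean):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000
x = rng.uniform(0, 3, size=n)
u = rng.exponential(scale=1 + x) - (1 + x)   # skewed, heteroskedastic, mean-zero errors
y = np.sin(x) + u                            # true conditional mean is not linear in x

X = np.column_stack([np.ones_like(x), x])    # design matrix with an intercept
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta_hat
resid = y - fitted

print("sum of residuals:         ", resid.sum())     # ~0 (because an intercept is included)
print("residuals . fitted values:", resid @ fitted)  # ~0: orthogonality of the projection
```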