Solved – difference between linear projection and linear regression (OLS)

projection, regression

In http://www.wouterdenhaan.com/numerical/slidesbayesian.pdf (approximately pages 7 to 13), ordinary least squares and linear projection are said to be different. But from my linear algebra class, I remember hearing that OLS is indeed a projection method. So I am confused here.

What exactly is the difference between these two?

Best Answer

OLS, conditional expectation and linear projection are all related. It helps to distinguish between the unknown data generating process (the model) and procedures to estimate the parameters of that model.

Let this be the model/data generating process, where $f$ is some unknown function:

$y_i = f(x_i, \theta) +\epsilon_i$, $E[x_i\epsilon_i]=0$

We could use OLS and regress $y_i$ on the vector $x_i$. The OLS estimator is defined to be the vector $b$ that minimises the sample sum of squares $(y-Xb)^T(y-Xb)$, where $y$ is $n \times 1$ and $X$ is $n \times k$.
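As a minimal sketch of that definition, the snippet below computes $b$ on simulated data with NumPy. The data, the seed, the sizes `n` and `k`, and the helper name `ols` are all illustrative assumptions, not part of the question or the slides. It also checks the linear-algebra view of OLS: the residuals are orthogonal to the columns of $X$, which is the sense in which OLS is a projection of $y$ onto the column space of $X$.

```python
import numpy as np

def ols(X, y):
    """Return the b that minimises (y - Xb)^T (y - Xb)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

rng = np.random.default_rng(0)            # arbitrary seed, for reproducibility
n, k = 500, 3                             # illustrative sample size and dimension
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)  # made-up linear dgp

b = ols(X, y)

# With full column rank, the same b solves the normal equations X'X b = X'y.
b_normal = np.linalg.solve(X.T @ X, X.T @ y)

# First-order condition: residuals are orthogonal to every column of X.
residuals = y - X @ b

print(b)
print(np.allclose(b, b_normal))
print(np.allclose(X.T @ residuals, 0, atol=1e-8))
```

`np.linalg.lstsq` is used rather than inverting $X^TX$ directly because it is usually the numerically safer choice; the normal-equations solve is there only as a cross-check.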

As the sample size $n$ gets larger, $b$ will converge to something (in probability). Whether it converges to $\theta$, though, depends on what the true model/dgp actually is, i.e. on $f$.

Suppose $f$ really is linear. Then $y_i = x_i^T\theta +\epsilon_i$ and $b$ converges to $\theta$. If in addition $E[\epsilon_i|x_i]=0$, then $E[y_i|x_i]=x_i^T\theta$, so the conditional expectation is linear too.

What if $f$ isn't linear? $b$ still converges to something, the thing it always converges to: the linear projection coefficient. What is a linear projection? It is the population equivalent of the OLS estimator: the vector $\beta$ that minimises $E[(y_i-x_i^T\beta)^2]$. Regardless of what the true relation between $y$ and $x$ is, this vector exists and OLS converges to it.
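To make the "population equivalent" concrete, here is a short derivation sketch. It assumes finite second moments and that $E[x_i x_i^T]$ is invertible, conditions the answer does not state explicitly. Setting the derivative of $E[(y_i-x_i^T\beta)^2]$ with respect to $\beta$ to zero gives

$$E[x_i(y_i - x_i^T\beta)] = 0 \quad\Longrightarrow\quad \beta = \left(E[x_i x_i^T]\right)^{-1} E[x_i y_i],$$

while the OLS estimator can be written as the same formula with sample averages in place of expectations:

$$b = (X^TX)^{-1}X^Ty = \left(\frac{1}{n}\sum_{i=1}^n x_i x_i^T\right)^{-1} \frac{1}{n}\sum_{i=1}^n x_i y_i.$$

Under a law of large numbers the sample averages converge to the corresponding expectations, which is why $b$ converges to $\beta$ whatever $f$ is.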

In the special case where the conditional expectation is linear, $\theta$ and $\beta$ are the same, and OLS recovers the conditional expectation function for you as the sample grows. If that function is not linear, OLS recovers just the linear projection coefficient, which can still be useful because it gives the linear approximation of the conditional expectation function that minimises mean squared error.
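A small simulation can illustrate the nonlinear case. Everything in it is an assumption made for the example: $x_i \sim U(0,1)$, the true function is $f(x) = x^2$, the noise is normal with standard deviation 0.1, and the regressors are a constant and $x_i$. For this setup the linear projection coefficient can be worked out from the moments of the uniform distribution: the slope is $\mathrm{Cov}(x, x^2)/\mathrm{Var}(x) = (1/12)/(1/12) = 1$ and the intercept is $E[x^2] - E[x] = 1/3 - 1/2 = -1/6$.

```python
import numpy as np

def ols(X, y):
    """Return the b that minimises (y - Xb)^T (y - Xb)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

def simulate(n, rng):
    """Draw n observations from the assumed nonlinear dgp and run OLS."""
    x = rng.uniform(0.0, 1.0, size=n)
    eps = rng.normal(scale=0.1, size=n)   # assumed noise level
    y = x**2 + eps                        # true f is quadratic, not linear
    X = np.column_stack([np.ones(n), x])  # regress on a constant and x only
    return ols(X, y)

# Linear projection coefficient (intercept, slope) for this dgp,
# derived by hand from the moments of U(0, 1).
beta = np.array([-1.0 / 6.0, 1.0])

rng = np.random.default_rng(0)            # arbitrary seed
for n in (100, 10_000, 1_000_000):
    b = simulate(n, rng)
    # The gap between b and beta should shrink as n grows.
    print(n, b, np.abs(b - beta).max())
```

The fitted line never matches $x^2$, but its coefficients settle near $(-1/6, 1)$, the best linear approximation in the mean squared error sense.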