[Math] name for the matrix $X(X^tX)^{-1}X^{t}$

linear algebra, statistics

In my work, I have repeatedly come across the matrix $\Lambda=X(X^tX)^{-1}X^{t}$, where $X$ is a generic $m\times n$ matrix with $m>n$. It is characterized by the following two properties:

(1) If $v$ is in the span of the column vectors of $X$, then $\Lambda v=v$.

(2) If $v$ is orthogonal to the span of the column vectors of $X$, then $\Lambda v = 0$.

(We assume that $X$ has full column rank, so that $X^tX$ is invertible.)
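
Both properties can be checked directly from the definition. As a short sketch, write a vector in the column span as $v=Xa$ for some coefficient vector $a\in\mathbb{R}^n$ (the symbol $a$ is introduced here only for illustration):

$$\Lambda v = X(X^tX)^{-1}X^tXa = Xa = v.$$

If instead $v$ is orthogonal to every column of $X$, then $X^tv=0$, and so

$$\Lambda v = X(X^tX)^{-1}(X^tv) = 0.$$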

I find this matrix neat, but for my work (in statistics) I need more intuition behind it. What does it mean in a probability context? We are deriving properties of linear regressions, where each row in $X$ is an observation.

Is this matrix known, and if so in what context (statistics would be optimal but if it is a celebrated operation in differential geometry, I'd be curious to hear as well)?

Best Answer

It is also called the hat matrix. The idea is that this matrix "puts the hat on" $y$: it transforms the dependent variable into its prediction in linear regression.

The linear regression model is the following:

$$y=X\beta+\varepsilon.$$

The least squares estimate of $\beta$ is given by

$$\hat\beta=(X^TX)^{-1}X^Ty.$$

The prediction of the model is then:

$$\hat{y}=X\hat\beta=X(X^TX)^{-1}X^Ty$$

So the matrix $X(X^TX)^{-1}X^T$ maps $y$ to $\hat{y}$, hence the name hat matrix.
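
A quick numerical check can make this concrete. The following is a minimal sketch using NumPy on simulated data; the dimensions, the random seed, the coefficient values, and the name `H` for the hat matrix are all assumptions chosen for illustration, not part of the question. It compares $Hy$ with the fitted values from an ordinary least squares solver and also tests the two properties stated in the question.

```python
import numpy as np

# Simulated regression data (all values here are illustrative assumptions).
rng = np.random.default_rng(0)
m, n = 50, 3                      # m observations (rows), n regressors, m > n
X = rng.normal(size=(m, n))       # a random X has full column rank almost surely
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=m)

# Hat matrix H = X (X^T X)^{-1} X^T.
# Solving the linear system avoids forming the explicit inverse.
H = X @ np.linalg.solve(X.T @ X, X.T)

# Least squares fit computed independently of H.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat

# H maps y to its prediction y_hat.
hat_matches_fit = np.allclose(H @ y, y_hat)

# Property (1): vectors in the column span of X are left unchanged.
v_in = X @ rng.normal(size=n)
fixes_column_space = np.allclose(H @ v_in, v_in)

# Property (2): the residual is orthogonal to the columns of X,
# so H sends it to (numerically) zero.
residual = y - y_hat
kills_orthogonal = np.allclose(H @ residual, 0.0, atol=1e-8)

print(hat_matches_fit, fixes_column_space, kills_orthogonal)
```

Forming the full $m\times m$ matrix is fine for a small check like this, but for large $m$ one would normally compute $\hat{y}$ from $\hat\beta$ directly rather than build `H`.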