[Math] Difference between orthogonal projection and least squares solution

least squares, linear algebra, vector-spaces, vectors

When you find the least squares solution you solve $$A^TA\vec x = A^T\vec b$$ but to find the orthogonal projection onto the "subspace" $A$ (its column space), you multiply this result (the least squares solution) by the original matrix. Why is this?

If you use the analogy with the light shining orthogonally on to the subspace and the orthogonal projection is the shadow in the subspace, isn't this shadow also the least squares solution?

Best Answer

A least squares solution is not the shadow you refer to in the shining light analogy. This shadow is the orthogonal projection of $b$ onto the column space of $A$, and it is unique. Call this projection $p$. A least squares solution of $Ax = b$ is a vector $x$ such that $Ax = p$. The vector $x$ need not be unique.
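To see where the normal equations come in, here is a short derivation sketch using only the symbols above ($A$, $b$, $x$, $p$). The projection $p = Ax$ is characterized by the residual $b - Ax$ being orthogonal to every column of $A$:

$$A^T(b - Ax) = 0 \quad\Longleftrightarrow\quad A^TAx = A^Tb, \qquad p = Ax$$

So solving the normal equations gives the coefficients $x$, and multiplying by $A$ turns those coefficients into the shadow $p$ itself. If the columns of $A$ happen to be linearly independent, $A^TA$ is invertible and $x$ is unique; otherwise, as in the example that follows, there are many such $x$ but still only one $p$.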

Consider the matrix

$$A = \begin{bmatrix} 4 &8 \\ 6 &12 \end{bmatrix}$$ and the vector

$$b = \begin{bmatrix} 5 \\ 1 \end{bmatrix} $$

which is not in $C(A) = \textrm{span} \left( \begin{bmatrix} 4 \\ 6 \end{bmatrix} \right)$. The orthogonal projection of $b$ onto $C(A)$ is given by

$$ p = \frac{\begin{bmatrix} 4 \\ 6 \end{bmatrix} \cdot \begin{bmatrix} 5 \\ 1 \end{bmatrix}}{\begin{bmatrix} 4 \\ 6 \end{bmatrix} \cdot \begin{bmatrix} 4 \\ 6 \end{bmatrix}} \begin{bmatrix} 4 \\ 6 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$
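As a quick check of this arithmetic, here is a minimal Python sketch. The helper names `dot` and `project_onto_line` are made up for illustration, and the code assumes the subspace is a line spanned by a single nonzero vector, which is the case here because both columns of $A$ are multiples of $(4, 6)$.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project_onto_line(a, b):
    """Orthogonal projection of b onto span(a); a must be nonzero."""
    scale = Fraction(dot(a, b), dot(a, a))  # (a . b) / (a . a)
    return [scale * ai for ai in a]

a = [4, 6]  # spans C(A)
b = [5, 1]

p = project_onto_line(a, b)
residual = [bi - pi for bi, pi in zip(b, p)]

assert p == [2, 3]            # matches the projection computed above
assert dot(residual, a) == 0  # b - p is orthogonal to C(A)
```

Exact fractions are used so the equality checks are not affected by floating point rounding.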

A least squares solution of $Ax = b$ is a vector $x$ such that

$$Ax = p$$

This system has infinitely many solutions. The solution set is

$$\left\{ t \begin{bmatrix} -2 \\ 1 \end{bmatrix} + \begin{bmatrix} \frac{1}{2} \\ 0 \end{bmatrix} : t \in \mathbb{R} \right\} $$

Therefore, both $x_1 = \begin{bmatrix} \frac{1}{2} \\ 0 \end{bmatrix}$ and $x_2 = \begin{bmatrix} -\frac{7}{2} \\ 2 \end{bmatrix}$, for instance, are least squares solutions, because both $Ax_1 = p$ and $Ax_2 = p$. But neither of these solutions is the "shadow" you refer to in the shining light analogy. Rather, $p$ is the shadow, and $x_1$ and $x_2$ are simply vectors you could multiply $A$ by to get $p$.
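The distinction can be checked numerically with a small Python sketch. The helpers `matvec`, `transpose`, and `solution` are illustrative names, not part of any library. The sketch confirms that different points on the solution line all map to the same $p$ and all satisfy $A^TAx = A^Tb$.

```python
from fractions import Fraction

A = [[4, 8], [6, 12]]
b = [5, 1]
p = [2, 3]  # orthogonal projection of b onto C(A), from the answer

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def solution(t):
    """Point on the solution line: t * (-2, 1) + (1/2, 0)."""
    return [Fraction(1, 2) - 2 * t, Fraction(t)]

At = transpose(A)
x1 = solution(0)  # (1/2, 0)
x2 = solution(2)  # (-7/2, 2)

for x in (x1, x2, solution(Fraction(-3, 7))):
    assert matvec(A, x) == p                          # A x is the shadow p
    assert matvec(At, matvec(A, x)) == matvec(At, b)  # normal equations hold

assert x1 != x2  # distinct least squares solutions, one projection
```

Every choice of `t` passes the same two checks, which is the point of the answer: the least squares solutions are coefficient vectors, and only $Ax$ is the projection.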
