[Math] In the least squares method, why is the shortest solution $x^{+}$ always in the row space of $A$?

linear algebra

I'm studying the least squares method and want to check whether my understanding is correct.

If $Ax=b$ has no solution, or no unique solution, we use the least squares method.
The projection of $b$ onto the column space (call it $p$) is the point $A\hat{x}$ nearest to $b$:
$p=A\hat{x}=Pb$, where $P$ is a projection matrix.

The reason we do this is as follows.
If $Ax=b$ doesn't have a unique solution,
then $A$ has dependent rows or dependent columns.
If $A$ has dependent rows, $b$ can lie outside the column space, and the least squares solution comes from projecting $b$ onto the column space.
If $A$ has dependent columns, the least squares solution $\hat{x}$ is not unique, so we choose the shortest one, $x^{+}$.

The least squares solution comes from the normal equations $A^{T}A\hat{x}=A^{T}b$.

The normal equations are formed this way:
All vectors perpendicular to the column space lie in the left nullspace.
Since $A\hat{x}$ is the projection of $b$ onto the column space,
the error vector $e=b-A\hat{x}$ is perpendicular to the column space and so must be in the nullspace of $A^{T}$.
So we can write $A^{T}(b-A\hat{x})=0$ or $A^{T}A\hat{x}=A^{T}b$.
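
A small numeric check of this derivation, as a sketch only: the matrix $A$ and vector $b$ below are illustrative choices, and the helper names (`transpose`, `matvec`, `matmul`, `solve_2x2`) are hypothetical functions written for this example, not part of any library. It solves $A^{T}A\hat{x}=A^{T}b$ for a $3\times 2$ matrix with independent columns and confirms that $e=b-A\hat{x}$ satisfies $A^{T}e=0$.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def matmul(M, N):
    Nt = transpose(N)
    return [[sum(a * b for a, b in zip(row, col)) for col in Nt] for row in M]

def solve_2x2(M, rhs):
    """Solve a 2x2 system exactly with Cramer's rule."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        # Dependent columns: the least squares solution is not unique.
        raise ValueError("A^T A is singular")
    return [Fraction(rhs[0] * d - b * rhs[1], det),
            Fraction(a * rhs[1] - c * rhs[0], det)]

# Illustrative data (an assumption for this example).
A = [[1, 0], [1, 1], [1, 2]]
b = [6, 0, 0]

At = transpose(A)
x_hat = solve_2x2(matmul(At, A), matvec(At, b))  # normal equations
p = matvec(A, x_hat)                             # projection of b onto col(A)
e = [bi - pi for bi, pi in zip(b, p)]            # error vector
At_e = matvec(At, e)                             # should be the zero vector

print("x_hat =", x_hat)
print("p =", p)
print("e =", e)
print("A^T e =", At_e)
```

Exact fractions are used instead of floats so that the orthogonality check $A^{T}e=0$ holds exactly rather than up to rounding.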

Okay, here is my question.
Why can any vector $\hat{x}$ be split into a row space component $x_r$ and a nullspace component $x_n$, so that $\hat{x}=x_r+x_n$?
(From this we claim that the shortest solution $x^{+}$ is always in the row space of $A$:
since $Ax_n=0$, the row space component also solves $A^{T}Ax_r=A^{T}b$.)

Best Answer

It comes from the fact that the rowspace and the nullspace are orthogonal complements. This immediately implies that for a real $m\times n$ matrix $A$, $$\mathbb{R}^n = \ker(A) \oplus \operatorname{col}(A^\mathrm{T})$$ so that any $\mathbf{x}\in\mathbb{R}^n$ can be written uniquely as $\mathbf{x} = \mathbf{x_r} + \mathbf{x_n}$ where $\mathbf{x_n}\in\ker(A)$ and $\mathbf{x_r}\in\operatorname{col}(A^\mathrm{T})$.
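
To connect this to the shortest solution: because $\mathbf{x_r}\perp\mathbf{x_n}$, we have $\|\mathbf{x}\|^2=\|\mathbf{x_r}\|^2+\|\mathbf{x_n}\|^2$. Any two least squares solutions differ by a vector in $\ker(A^\mathrm{T}A)=\ker(A)$, so they share the same $\mathbf{x_r}$, and the shortest one is the one with $\mathbf{x_n}=\mathbf{0}$. The sketch below checks this on a rank-1 example chosen for illustration: $A$ has both rows equal to $(1,1)$ and $b=(1,3)$, so the normal equations reduce to $x_1+x_2=2$. The helper names `dot` and `split` are hypothetical, and `split` projects onto the single vector `r`, which only works because the row space here is one-dimensional.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Illustrative rank-1 matrix: every row is a multiple of r,
# so the row space is the line spanned by r.
r = [1, 1]
A = [[1, 1], [1, 1]]

def split(x):
    """Split x into its row space part (projection onto r) and nullspace part."""
    coeff = Fraction(dot(x, r), dot(r, r))
    x_r = [coeff * ri for ri in r]
    x_n = [xi - xri for xi, xri in zip(x, x_r)]
    return x_r, x_n

# Two different least squares solutions; both satisfy x1 + x2 = 2.
results = []
for x_hat in ([3, -1], [2, 0]):
    x_r, x_n = split(x_hat)
    A_xn = [dot(row, x_n) for row in A]   # zero vector: x_n is in the nullspace
    results.append((x_r, x_n, A_xn, dot(x_r, x_n),
                    dot(x_hat, x_hat), dot(x_r, x_r), dot(x_n, x_n)))
    print(x_hat, "->", x_r, "+", x_n)
```

Both solutions give the same row space component $(1,1)$, which is the shortest solution $x^{+}$ for this example; the nullspace parts only add length.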

If you require a proof of the fact that the nullspace and rowspace are orthogonal complements, pretty much every standard linear algebra text should contain such a proof.
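
For reference, a short sketch of that standard argument. The notation $r_1^\mathrm{T},\dots,r_m^\mathrm{T}$ for the rows of a real $m\times n$ matrix $A$ is introduced here for illustration:

$$A\mathbf{x}=\mathbf{0} \iff r_i^\mathrm{T}\mathbf{x}=0 \text{ for all } i \iff \mathbf{x}\perp\operatorname{span}\{r_1,\dots,r_m\}=\operatorname{col}(A^\mathrm{T})$$

so $\ker(A)=\operatorname{col}(A^\mathrm{T})^{\perp}$. A subspace of $\mathbb{R}^n$ and its orthogonal complement intersect only in $\mathbf{0}$, and their dimensions add up to $n$, which gives the direct sum decomposition.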