Linear Algebra – Difference Between Least Squares and Minimum Norm Solution

least squareslinear algebrasystems of equations

Consider a linear system of equations $Ax = b$.

  • If the system is overdetermined, the least squares (approximate) solution minimizes $||b – Ax||^2$. Some source sources also mention $||b – Ax||$.

  • If the system is underdetermined one can calculate the minimum norm solution. But it does also minimize $||b – Ax||$, or am I wrong?

But if least squares is also a minimum norm, what is the difference, or the rationale of the different naming?

Best Answer

Linear system $$ \mathbf{A} x = b $$ where $\mathbf{A}\in\mathbb{C}^{m\times n}_{\rho}$, and the data vector $b\in\mathbf{C}^{n}$.

Least squares problem

Provided that $b\notin\color{red}{\mathcal{N}\left( \mathbf{A}^{*}\right)}$, a least squares solution exists and is defined by $$ x_{LS} = \left\{ x\in\mathbb{C}^{n} \colon \lVert \mathbf{A} x - b \rVert_{2}^{2} \text{ is minimized} \right\} $$

Least squares solution

The minimizers are the affine set computed by $$ x_{LS} = \color{blue}{\mathbf{A}^{+} b} + \color{red}{ \left( \mathbf{I}_{n} - \mathbf{A}^{+} \mathbf{A} \right) y}, \quad y \in \mathbb{C}^{n} \tag{1} $$ where vectors are colored according to whether they reside in a $\color{blue}{range}$ space or $\color{red}{null}$ space.

The red dashed line is the set of the least squares minimizers.

hyperplane

Least squares solution of minimum norm

To find the minimizers of the minimum norm, the shortest solution vector, compute the length of the solution vectors.

$$ % \lVert x_{LS} \rVert_{2}^{2} = % \Big\lVert \color{blue}{\mathbf{A}^{+} b} + \color{red}{ \left( \mathbf{I}_{n} - \mathbf{A}^{+} \mathbf{A} \right) y} \Big\rVert_{2}^{2} % = % \Big\lVert \color{blue}{\mathbf{A}^{+} b} \Big\rVert_{2}^{2} + \Big\lVert \color{red}{ \left( \mathbf{I}_{n} - \mathbf{A}^{+} \mathbf{A} \right) y} \Big\rVert_{2}^{2} % $$ The $\color{blue}{range}$ space component is fixed, but we can control the $\color{red}{null}$ space vector. In fact, chose the vector $y$ which forces this term to $0$.

Therefore, the least squares solution of minimum norm is $$ \color{blue}{x_{LS}} = \color{blue}{\mathbf{A}^{+} b}. $$ This is the point where the red dashed line punctures the blue plane. The least squares solution of minimum length is the point in $\color{blue}{\mathcal{R}\left( \mathbf{A}^{*}\right)}$.

Full column rank

You ask about the case of full column rank where $n=\rho$. In this case, $$ \color{red}{\mathcal{N}\left( \mathbf{A} \right)} = \left\{ \mathbf{0} \right\}, $$ the null space is trivial. There is no null space component, and the least squares solution is a point.

In other words, $$ \color{blue}{x_{LS}} = \color{blue}{\mathbf{A}^{+} b} $$ is always the least squares solution of minimum norm. When the matrix has full column rank, there is no other component to the solution. When the matrix is column rank deficient, the least squares solution is a line.