[Math] Why does SVD provide the least squares and least norm solution to $ A x = b $

least-squares, linear-algebra, optimization, svd

I am studying the singular value decomposition and its properties. It is widely used to solve equations of the form $Ax=b$. I have seen the following: given the system $Ax=b$, we compute the SVD of $A$ as $A=U\Sigma V^T$ and then calculate $x'= V \Sigma^{+}U^Tb$. Here $\Sigma^{+}$ has the reciprocals $\dfrac{1}{\sigma_i}$ of the nonzero singular values on its diagonal and zeros where $\sigma_i=0$. If $b$ is in the range of $A$, then $x'$ is the solution of minimum norm (closest to the origin). If $b$ is not in the range, then $x'$ is the least-squares solution.
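In case it helps to see the recipe concretely, here is a minimal NumPy sketch of the procedure; the matrix $A$, the vector $b$, and the rank tolerance are all illustrative choices, not anything from the question:

```python
import numpy as np

# Illustrative rank-2 matrix, so Sigma contains a zero singular value
# and A has a nontrivial null space.
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.],
              [0., 1., 1.]])
b = A @ np.array([1., 1., 1.])        # b constructed to lie in the range of A

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Build the diagonal of Sigma^+: reciprocals of the nonzero singular
# values, zeros where sigma_i is (numerically) zero.
tol = max(A.shape) * np.finfo(float).eps * s.max()
s_inv = np.zeros_like(s)
s_inv[s > tol] = 1.0 / s[s > tol]

x = Vt.T @ (s_inv * (U.T @ b))        # x' = V Sigma^+ U^T b

print(np.allclose(A @ x, b))                  # True: b is in the range of A
print(np.allclose(x, np.linalg.pinv(A) @ b))  # True: matches NumPy's pseudoinverse
```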

I fail to see how exactly this procedure always produces an $x'$ that is closest to the origin when $b$ is in the range of $A$. (I can see that the least-squares solution is an extension of this "closest to origin" property.) How can we show this property of the SVD, ideally in a geometric, intuitive way?

Best Answer

First, consider the problem $\Sigma x = b$, where $$ \Sigma = \pmatrix{\sigma_1\\& \ddots\\&&\sigma_r\\ &&&0\\&&&&\ddots\\&&&&&0}. $$ Note that $b$ is in the range of $\Sigma$ only if its entries $b_{r+1},\dots,b_n$ are all zero. Furthermore, you should be able to convince yourself (geometrically or otherwise) that the minimum-norm least-squares solution must be $$ x = (b_1/\sigma_1,\dots,b_r/\sigma_r,0,\dots,0)^T = \Sigma^+ b. $$ The first $r$ entries are forced on us if we want to minimize the residual, while any nonzero entry among the last entries would increase $\|x\|$ without changing $\Sigma x$ at all.

From there, write $A = U \Sigma V^T$ and note that, since $U$ is orthogonal, $$ \|Ax - b\| = \|U(\Sigma V^T x - U^T b)\| = \|\Sigma (V^T x) - U^T b\|. $$ In other words, minimizing $\|Ax - b\|$ over $x$ is the same as minimizing $\|\Sigma y - U^T b\|$ over $y = V^T x$. By the above argument, the minimum-norm least-squares choice is $V^T x = \Sigma^+ U^T b$. Noting that $\|V^T x\| = \|x\|$ (because $V$ is also orthogonal), we conclude that $x = (V \Sigma^+ U^T) b$ must be the minimum-norm least-squares solution for $x$.
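To make this concrete, here is a small numerical check under the same kind of illustrative setup (a rank-deficient $A$ and a $b$ chosen off its range, both my own choices): every least-squares solution differs from $x = V\Sigma^+ U^T b$ by a null-space vector of $A$, all of them leave the same residual, and the pseudoinverse solution has the smallest norm.

```python
import numpy as np

# Same illustrative rank-2 matrix as before; this b does NOT lie
# in the range of A, so Ax = b has no exact solution.
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.],
              [0., 1., 1.]])
b = np.array([1., 0., 2., 1.])

x_star = np.linalg.pinv(A) @ b        # x' = V Sigma^+ U^T b

# Every least-squares solution is x_star plus a null-space vector of A.
_, _, Vt = np.linalg.svd(A)
n = Vt[-1]                            # right singular vector for sigma = 0

for t in [0.0, 0.5, 2.0]:
    x = x_star + t * n
    print(np.linalg.norm(A @ x - b),  # residual is identical for every t ...
          np.linalg.norm(x))          # ... but ||x|| is smallest at t = 0
```

The norms behave this way because $x_\star$ lies in the row space of $A$, which is orthogonal to the null space, so $\|x_\star + tn\|^2 = \|x_\star\|^2 + t^2\|n\|^2$.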

I hope you find this explanation sufficient.