[Math] How does one prove the solution of minimum Euclidean Norm to the least squares problem

linear algebranumerical linear algebra

If we have some $m \times n$ matrix $A$ with an $m$-vector $b$, how does one prove that the solution $x$ of the minimum Euclidean norm to the least squares problem $Ax \approx b$ is given by

$$
x = \sum_{\sigma_i \neq 0} \frac{u_i^Tb}{\sigma_i}v_i
$$

where the $\sigma_i$, $u_i$, and $v_i$ are the singular values and corresponding singular vectors of $A$.

Best Answer

If $A=\sum\limits_{i=1}^r\sigma_i u_iv_i^T$ is the SVD of $A$ (where $\sigma_i\neq 0$, $i=1,\ldots,r$), any vector $b$ can be expressed as a linear combination of $u_i$'s plus a term which is orthogonal to them, that is, $$ b=\sum_{i=1}^r\beta_iu_i+c, $$ where $\beta_i=u_i^Tb$ and $u_i^Tc=0$ for $i=1,\ldots,r$. Hence the residual $b-Ax$ is given by $$ b-Ax=c+\sum_{i=1}^r\left[\beta_iu_i-\sigma_i (v_i^Tx) u_i\right]=c+\sum_{i=1}^r\left[u_i^Tb-\sigma_i (v_i^Tx)\right]u_i. $$ Since the summands are mutually orthogonal, we can use the Pythagorean theorem to compute the norm of $b-Ax$ as $$ \|b-Ax\|_2^2=\|c\|_2^2+\sum_{i=1}^r\left[u_i^Tb-\sigma_i (v_i^Tx)\right]^2. $$ The first term is constant and, therefore, the residual norm is minimal if and only if $$\tag{1}v_i^Tx=\frac{u_i^Tb}{\sigma_i}, \quad i=1,\ldots,r.$$ Now any $x$ can be similarly to $b$ expressed as a linear combination of $v_i$'s plus a term orthogonal to these vectors: $$ x=\sum_{i=1}^r\xi_iv_i+y, $$ where $\xi_i=v_i^Tx$ and $v_i^Ty=0$ for $i=1,\ldots,r$. Hence, using (1), any $x$ of the form $$\tag{2} x=\sum_{i=1}^r\frac{u_i^Tb}{\sigma_i}v_i+y $$ minimizes the Euclidean norm of the residual $b-Ax$. There can be an infinite number of vectors $x$ minimizing the residual norm if the orthogonal complement of $\mathrm{span}(v_i)_{i=1}^r$ is nontrivial (that is, $A$ has a nontrivial kernel). Since $$ \|x\|_2^2=\sum_{i=1}^r\left(\frac{u_i^Tb}{\sigma_i}\right)^2+\|y\|_2^2, $$ we can see that setting $y=0$ in (2) leads to the solution, which, in addition to minimizing the residual, has the minimal norm.