Singular Value Decomposition Theorem: Corollary

analysislinear algebrasvd

In a linear algebra and analysis course [it's a hybrid course between the two], we recently had the SVD (singular value decomposition) theorem, and the prof. told us (due to lack of time without proof):

Corollary 2.39: Let $A = U\Sigma V^{T}$ be the singular value decomposition of $A\in \mathbb R^{m\times n}$, where the singular values $\sigma_1 \geq \dots \geq \sigma_p \geq 0$, where $p = \min\left\{ m, n\right\}$. Further, for $k<p$, define

$$A_{k} = U\Sigma_{k}V^{T},$$

where $$\Sigma_{k} = \begin{pmatrix} \sigma_1 \qquad\qquad\qquad 0 \\ \qquad \ddots \qquad \\ \quad\quad\quad\sigma_{k} \\ 0 \qquad\qquad\qquad 0 \end{pmatrix} \in \mathbb R^{m\times n},$$

with $k < p$. It holds that: $$A_k = \arg\min_{\text{rank}\left( B \right) = k}\left|\left| A – B\right|\right|_{2} = \arg\min_{\text{rank}\left( B\right) = k}\left|\left| A – B\right|\right|_{F}.$$


Upon question on a hint of the proof the lecturer said that one might want to use the following relations: $$\text{Tr}\left( AB^{T} \right) \leq \sum_{j=1}^{p} \sigma_{j}\gamma_{j},$$ where $\gamma_{1} \geq \dots \geq \gamma_{p} \geq 0$ denote the singular values of $B\in\mathbb R^{m\times n}$. I also proved the following two relations:

$$\left|\left| A – A_{k} \right|\right|_{2} = \sigma_{k+1}, \qquad \left|\left| A – A_{k} \right|\right|_{F} = \left( \sum_{j = k+1}^{p} \sigma_{j}^{2} \right)^{1/2} \qquad $$


IDEA: I tried several things, one of them being: $$\left|\left| A – B\right|\right|_{2} \leq \left|\left| A – A_{k} \right|\right|_{2} + \left|\left| B – A_{k} \right|\right|_{2} = \sigma_{k+1} + \left|\left| B – A_{k} \right|\right|_{2}$$

But then, I am not sure how to continue. We already know from the Corollary that $A_k = U\Sigma_k V^{T}$ [by definition], and $B = \tilde{U}\tilde{\Sigma}\tilde{V^{T}}$ [by the SVD Theorem], i.e.
$\left|\left| B – A_{k} \right|\right|_{2} = \left|\left| \tilde{U}\tilde{\Sigma}\tilde{V^{T}} – U\Sigma_k V^{T}\right|\right|_{2}$.

Could anybody help out, please, on how to continue?

Best Answer

Operator norm:

You've shown $\|A-A_k\|_2 = \sigma_{k+1}$, so it remains to show that $\|A-B\|_2 \ge \sigma_{k+1}$ for any rank $k$ matrix $B$.

  • Suppose $B$ has rank $k$. The nullspace of $B$ has dimension $n-k$.
  • Let $v_1, \ldots, v_{k+1}$ be the top singular vectors of $A$ (i.e. $Av_i = \sigma_i u_i$). The span of these vectors has dimension $k+1$.
  • Since the above two subspaces have dimensions $n-k$ and $k+1$ summing to more than $n$, its intersection is nonempty. Let $z$ be a unit vector in both these subspaces.
  • Show that $$\|A-B\|_2^2 \ge \|(A-B)z\|_2^2 = \|Az\|_2^2 \ge \sigma_{k+1}^2.$$ (The first inequality is due to the definition of the operator norm. The second equality is due to $z \in \ker B$. The third inequality can be shown by writing $z$ as a linear combination of $v_1, \ldots, v_{k+1}$ and noting that these vectors map to $\sigma_i u_i$ for $i \ge k+1$.)

Frobenius norm:

You've shown $\|A-A_k\|_F^2 = \sum_{j=k+1}^p \sigma_j^2$ (note that you still have a typo), so it remains to show that $\|A-B\|_F^2 \ge \sum_{j=k+1}^p \sigma_j^2$ for any rank $k$ matrix $B$.

For a matrix $M$ I will use $\sigma_i(M)$ to denote the $i$th singular value of $M$, and let $M_k$ denote the result of keeping only the top $k$ singular values (just as you have defined $A_k$).

Using the result $\|M - M_k\| = \sigma_{k+1}$ that you proved, \begin{align} \sigma_i(A-B) &= \|(A-B) - (A-B)_{i-1}\|_2 \\ &= \|A - (B + (A-B)_{i-1})\|_2 \\ &\ge \|A - A_{k+i-1}\|_2 \\ &= \sigma_{k+i}(A). \end{align} The inequality is due to $\text{rank}(B + (A-B)_{i-1}) \le \text{rank}(B) + \text{rank}((A-B)_{i-1}) = k+i-1$, combined with the above result that $A_{k+i-1}$ is the best rank $k+i-1$ approximation of $A$ in the operator norm.

Using the fact that the squared Frobenius norm is the sum of squares of its singular values, we have $$\|A-B\|_F^2 = \sum_{i=1}^{\min(m,n)} (\sigma_i(A-B))^2 \ge\sum_{j=k+1}^p (\sigma_j(A))^2.$$