SVD – Understanding a Derivation of the Singular Value Decomposition

svd

Here's an attempt to motivate the SVD. Let $A \in \mathbb R^{m \times n}$. It's natural to ask, in what direction does $A$ have the most "impact".
In other words, for which unit vector $v$ is $\| A v \|_2$ the largest?
Denote this unit vector as $v_1$. Let $\sigma_1 = \| A v_1 \|_2$, and define $u_1$ by $A v_1 = \sigma_1 u_1$.

Next, we would like to know in what direction orthogonal to $v_1$ does $A$ have the most "impact"? In other words, for which unit vector $v \perp v_1$ is
$\| A v \|_2$ the largest? Denote this unit vector as $v_2$. Let $\sigma_2 = \| A v_2 \|_2$, and define $u_2$ by
$A v_2 = \sigma_2 u_2$.

Question: Are the vectors $u_1$ and $u_2$ guaranteed to be orthogonal? If so, is there an easy proof for this fact, or a viewpoint that makes this obvious?

Best Answer

if $$\max_{\|v\|=1} \|A v\|$$ has $v_1$ as solution, then for every $v \perp v_1$ : $Av \perp A v_1$.

suppose by contradiction that it exists $v_2 \perp v_1$ such that $Av_2 = c A v_1 + u_2$ where $c\ne 0$ and $u_2 \perp A v_1$. then $v_1$ can't be a maximiser of $\max_{\|v\|=1} \|A v\|$ :

let $w(\epsilon) = \sqrt{1-\epsilon^2} v_1 + \epsilon v_2$, hence $\|w(\epsilon)\| = 1$, and $$A w(\epsilon) = \sqrt{1-\epsilon^2} A v_1 + \epsilon A v_2 = (\sqrt{1-\epsilon^2} + \epsilon c) A v_1 + \epsilon u_2$$

i.e. : $$\|A w(\epsilon)\|^2 = \underbrace{|\sqrt{1-\epsilon^2} + \epsilon c|^2}_{>\ 1 \ \text{if } \ \epsilon \ \text{ is small enough }} \|A v_1\|^2+ \epsilon^2 \|u_2\|^2 $$

this is enough to prove the SVD of matrices, since we can repeatedly compute $\max_{\|v\|=1} \|A_k v\|$ on $A_1 = A$ and then on $A_{k+1} = A_{k} - A_{k}v_{k} v_{k}^T$ where $v_{k}$ is the maximiser of the previous maximisation, and hence this is enough to prove the spectral theorem too.

Related Question