SVD Exercise from Watkins' Fundamentals of Matrix Computations

Tags: matrix-decomposition, numerical-linear-algebra, svd

Apologies that this turned out a little long.

I have been studying the singular value decomposition in Fundamentals of Matrix Computations by Watkins, and I could not solve Exercise 4.1.16, which is a proof of the following theorem:

(Watkins) Theorem 4.1.1 (SVD Theorem) Let $A\in \mathbb{R}^{n \times m}$ be a nonzero matrix with rank $r$. Then $A$ can be expressed as a product
\begin{equation} A = U \Sigma V^{T} \end{equation}
where $U \in \mathbb{R}^{n \times n}$ and $V \in \mathbb{R}^{m \times m}$ are orthogonal, and $\Sigma \in \mathbb{R}^{n \times m}$ is a rectangular "diagonal" matrix
\begin{equation} \Sigma =
\left(
\begin{array}{ccccc}
\sigma_{1} & & & & \\
 & \sigma_{2} & & & \\
 & & \ddots & & \\
 & & & \sigma_{r} & \\
 & & & & 0
\end{array}
\right), \qquad \sigma_{1}\geq \sigma_{2} \geq \cdots \geq \sigma_{r} > 0,
\end{equation}
with all entries off this diagonal equal to zero.
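Before the proof, a quick numerical illustration of the theorem may help. This is my own sketch using numpy's built-in SVD, not part of Watkins' text:

```python
import numpy as np

rng = np.random.default_rng(0)

# A 5x4 matrix of rank 2, built as a product of random factors.
n, m, r = 5, 4, 2
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))

U, s, Vt = np.linalg.svd(A)              # s holds the singular values, descending

print(np.allclose(U @ U.T, np.eye(n)))   # U is orthogonal
print(np.allclose(Vt @ Vt.T, np.eye(m))) # V is orthogonal
print(np.round(s, 12))                   # sigma_1 >= sigma_2 > 0, the rest ~ 0

# Reassemble A = U Sigma V^T with the rectangular "diagonal" Sigma.
Sigma = np.zeros((n, m))
np.fill_diagonal(Sigma, s)
print(np.allclose(A, U @ Sigma @ Vt))    # True
```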

The Exercise is:

(Watkins) Exercise 4.1.16 In this exercise you will prove Theorem 4.1.1 by induction on $r$, the rank of $A$.

(a) Suppose $A \in \mathbb{R}^{n \times m}$ has rank 1. Let $u_1 \in \mathbb{R}^{n}$ be a vector in $R(A)$ such that $|| u_1 ||_2 = 1$. Show that every column of $A$ is a multiple of $u_1$. Show that $A$ can be written in the form $ A = \sigma_1 u_1 v_{1}^{T}$, where $v_1 \in \mathbb{R}^{m}$, $|| v_1 ||_2 = 1$, and $\sigma_1 > 0$.
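For intuition on part (a), here is a tiny numpy check (my own illustration) that the columns of a rank-1 matrix are all multiples of one unit vector:

```python
import numpy as np

rng = np.random.default_rng(0)
u1 = rng.standard_normal(4)
u1 /= np.linalg.norm(u1)                     # unit vector spanning R(A)
A = np.outer(u1, rng.standard_normal(6))     # rank 1: column j is c_j * u1

coeffs = u1 @ A                              # c_j = u1^T A[:, j]
print(np.allclose(A, np.outer(u1, coeffs)))  # True: every column is a multiple of u1
```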

(b) Continuing from part (a), demonstrate that there is an orthogonal matrix $U \in \mathbb{R}^{n \times n}$ whose first column is $u_1$. (For example, $U$ can be a reflector that maps the unit vector $e_1$ to $u_1$.) Similarly there is an orthogonal $V \in \mathbb{R}^{m \times m}$ whose first column is $v_1$. Show that $A = U \Sigma V^{T}$, where $\Sigma \in \mathbb{R}^{n \times m}$ has only one nonzero entry, $\sigma_1$, in position $(1,1)$. Thus every matrix of rank 1 has an SVD.
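Parts (a) and (b) together can be made concrete as follows. This is a minimal sketch: the helper name `reflector_with_first_column` is mine, and the formula is the standard Householder construction mentioned in the hint.

```python
import numpy as np

def reflector_with_first_column(u):
    """Householder reflector Q = I - 2ww^T/(w^T w) with w = u - e_1.

    Q is orthogonal (indeed symmetric) and Q e_1 = u, so its first
    column is u, as part (b) requires. Assumes ||u||_2 = 1.
    """
    n = len(u)
    w = u - np.eye(n)[:, 0]
    s = w @ w
    if s < 1e-28:                          # u is already e_1
        return np.eye(n)
    return np.eye(n) - 2.0 * np.outer(w, w) / s

# Part (a): a rank-1 matrix has the form sigma_1 * u1 * v1^T.
rng = np.random.default_rng(1)
u1 = rng.standard_normal(5); u1 /= np.linalg.norm(u1)
v1 = rng.standard_normal(3); v1 /= np.linalg.norm(v1)
sigma1 = 4.2
A = sigma1 * np.outer(u1, v1)

# Part (b): embed sigma_1 at position (1,1) and wrap with the reflectors.
U = reflector_with_first_column(u1)
V = reflector_with_first_column(v1)
Sigma = np.zeros((5, 3)); Sigma[0, 0] = sigma1
print(np.allclose(A, U @ Sigma @ V.T))     # True: an SVD of the rank-1 matrix A
```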

(c) Now suppose $A \in \mathbb{R}^{n \times m}$ has rank $r > 1$. Let $v_1$ be a unit vector in the direction of maximum magnification by $A$, i.e., $|| v_1 ||_2 = 1$, and $|| A v_1 ||_2 = \max_{|| v ||_{2} = 1} || Av||_2$. Let $\sigma_{1}=\left\|A v_{1}\right\|_{2}=\|A\|_{2}$, and let $u_1 = \sigma_{1}^{-1}Av_1$. Let $\tilde{U} \in \mathbb{R}^{n \times n}$ and $\tilde{V} \in \mathbb{R}^{m \times m}$ be orthogonal matrices with first column $u_1$ and $v_1$, respectively. Let $\tilde{A} = \tilde{U}^{T} A \tilde{V}$, so that $A = \tilde{U} \tilde{A} \tilde{V}^{T}$. Show that $\tilde{A}$ has the form
\begin{equation}
\tilde{A}=\left[ \begin{array}{cc}{\sigma_{1}} & {z^{T}} \\ {0} & {\hat{A}}\end{array}\right],
\end{equation}

where $z \in \mathbb{R}^{m-1}$ and $\hat{A} \in \mathbb{R}^{(n-1) \times (m-1)}$.

(d) Show that in matrix $\tilde{A}$ above the vector $z$ must be zero. You may do this as follows: Let $w=\left[ \begin{array}{c}{\sigma_{1}} \\ {z}\end{array}\right] \in \mathbb{R}^{m}$. Show that $\|\tilde{A} w\|_{2} /\|w\|_{2} \geq \sqrt{\sigma_{1}^{2}+z^{T} z}$. Then show that this inequality forces $z = 0$. Thus
\begin{equation}
\tilde{A}=\left[ \begin{array}{cc}{\sigma_{1}} & {0} \\ {0} & {\hat{A}}\end{array}\right].
\end{equation}

(e) Show that $\hat{A}$ has rank $r-1$. By the induction hypothesis $\hat{A}$ has an SVD $\hat{A} = \hat{U} \hat{\Sigma} \hat{V}^{T}$. Let $\sigma_2 \geq \sigma_3 \geq \cdots \geq \sigma_r$ denote the positive main-diagonal entries of $\hat{\Sigma}$. Show that $\sigma_1 \geq \sigma_2$. Embed the SVD $\hat{A} = \hat{U} \hat{\Sigma} \hat{V}^{T}$ in larger matrices to obtain an SVD of $\tilde{A}$. Then use the equation $A = \tilde{U} \tilde{A} \tilde{V}^{T}$ to obtain an SVD of $A$.
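Taken together, parts (a) through (e) are a recursive construction, and it can be instructive to run it. Below is a minimal numpy sketch of that recursion. The function names are mine, and I obtain the direction of maximum magnification $v_1$ as the top eigenvector of $A^{T}A$; this mirrors the proof, not how SVDs are computed in practice.

```python
import numpy as np

def reflector_with_first_column(u):
    # Householder reflector with Q e_1 = u (see the part (b) sketch above).
    w = u - np.eye(len(u))[:, 0]
    s = w @ w
    return np.eye(len(u)) if s < 1e-28 else np.eye(len(u)) - 2.0 * np.outer(w, w) / s

def svd_by_induction(A, tol=1e-12):
    """SVD following the induction of Exercise 4.1.16: A ~= U @ Sigma @ V.T."""
    n, m = A.shape
    if np.linalg.norm(A) < tol:               # base case: the zero matrix
        return np.eye(n), np.zeros((n, m)), np.eye(m)

    # Part (c): v1 maximizes ||A v||_2 over unit vectors; it is the top
    # eigenvector of A^T A. Then sigma_1 = ||A||_2 and u1 = A v1 / sigma_1.
    v1 = np.linalg.eigh(A.T @ A)[1][:, -1]
    sigma1 = np.linalg.norm(A @ v1)
    u1 = (A @ v1) / sigma1

    Ut = reflector_with_first_column(u1)
    Vt = reflector_with_first_column(v1)
    At = Ut.T @ A @ Vt                         # [[sigma1, z^T], [0, Ahat]], z ~ 0 by (d)

    # Part (e): recurse on Ahat and embed the result in larger matrices.
    Uh, Sh, Vh = svd_by_induction(At[1:, 1:], tol)
    U = Ut @ np.block([[np.eye(1), np.zeros((1, n - 1))],
                       [np.zeros((n - 1, 1)), Uh]])
    V = Vt @ np.block([[np.eye(1), np.zeros((1, m - 1))],
                       [np.zeros((m - 1, 1)), Vh]])
    Sigma = np.zeros((n, m))
    Sigma[0, 0] = sigma1
    Sigma[1:, 1:] = Sh
    return U, Sigma, V

A = np.random.default_rng(2).standard_normal((4, 6))
U, Sigma, V = svd_by_induction(A)
print(np.allclose(A, U @ Sigma @ V.T))         # True up to rounding
print(np.round(np.diag(Sigma), 6))             # sigma_1 >= sigma_2 >= ... > 0
```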


Question:

I know the exercise is well laid out (it is a step-by-step guide, so I should probably be able to complete the proof easily), but I could not do it, and what I read in the books by Strang and by Golub, which I consulted for help, did not get me any further.
So I would like to understand the solution to this theorem/exercise.

Best Answer

OK, suppose that $A = U \Sigma V^{T}$. First of all, consider how the SVD is computed. The construction of the SVD uses the following idea.

We get the left singular vectors (the columns of $U$) by forming the symmetric matrix $AA^{T}$:

$$ AA^{T} = U \Sigma V^{T}( U \Sigma V^{T})^{T} = U \Sigma V^{T} (V \Sigma^{T} U^{T}) $$

now note that $V^{T}V = I$ so we have

$$ U \Sigma V^{T} (V \Sigma^{T} U^{T}) = U \Sigma \Sigma^{T} U^{T} $$

now note that $\Sigma \Sigma^{T} \in \mathbb{R}^{n \times n}$ is diagonal, with $(j,j)$ entry $\sigma_{j}^{2}$ for $j \leq r$ and $0$ otherwise. Since the squared singular values are precisely the eigenvalues of $AA^{T}$, that is $\sigma_{i}^{2} = \lambda_{i}$, we can write $\Sigma\Sigma^{T} = \Lambda$, so that

$$ AA^{T} = U \Sigma \Sigma^{T} U^{T} = U \Lambda U^{T}, $$

an eigendecomposition of $AA^{T}$.

Similarly, the right singular vectors (the columns of $V$) come from $A^{T}A$:

$$ A^{T}A = (U \Sigma V^{T})^T (U \Sigma V^{T}) = V \Lambda V^{T} $$
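This pair of identities is easy to sanity-check numerically; a small sketch of mine, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))
U, s, Vt = np.linalg.svd(A)

# Eigenvalues of A^T A are the squared singular values.
lam = np.linalg.eigh(A.T @ A)[0]           # ascending order
print(np.allclose(np.sort(lam)[::-1], s**2))

# Columns of V are eigenvectors of A^T A: A^T A V = V Lambda.
V = Vt.T
print(np.allclose(A.T @ A @ V, V @ np.diag(s**2)))
```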

To see where the induction comes in, take an orthonormal basis $v_{1}, \dots, v_{m}$ of $\mathbb{R}^{m}$ whose first vector $v_{1}$ is a unit eigenvector of $A^{T}A$ belonging to the largest eigenvalue $\lambda_{1}$, and let $V$ have these vectors as columns. Then

$$ V^{T} A^{T}A V = \begin{pmatrix} v_{1}^{T} \\ v_{2}^{T} \\ \vdots \\ v_{m}^{T} \end{pmatrix} \begin{pmatrix} A^{T}A v_{1} & A^{T}A v_{2} & \cdots & A^{T} A v_{m} \end{pmatrix} = \begin{pmatrix} v_{1}^{T} \\ v_{2}^{T} \\ \vdots \\ v_{m}^{T} \end{pmatrix} \begin{pmatrix} \lambda_{1} v_{1} & A^{T}A v_{2} & \cdots & A^{T} A v_{m} \end{pmatrix} = \begin{pmatrix} \lambda_{1} & z^{T} \\ 0 & B \end{pmatrix}, $$

where the zeros in the first column come from $v_{i}^{T} v_{1} = 0$ for $i > 1$,

and where, by the induction hypothesis applied to the smaller block, $B = \hat{V} S \hat{V}^{T}$ with $\hat{V}$ orthogonal and $S$ diagonal, so

$$ V^{T} A^{T}A V = \begin{pmatrix} \lambda_{1} & z^{T} \\ 0 & \hat{V} S \hat{V}^{T}\end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & \hat{V} \end{pmatrix} \begin{pmatrix} \lambda_{1} & z^{T}\hat{V} \\ 0 & S \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \hat{V} \end{pmatrix}^{T} $$

(In fact $z = 0$ here automatically, since $V^{T}A^{T}AV$ is symmetric.)

We use the same idea for $\tilde{A} = \tilde{U}^{T} A \tilde{V}$

$$ \tilde{U}^{T} A \tilde{V} = \begin{pmatrix} \tilde{u}_{1}^{T} \\ \tilde{u}_{2}^{T} \\ \vdots \\ \tilde{u}_{n}^{T} \end{pmatrix} \begin{pmatrix} A \tilde{v}_{1} & A \tilde{v}_{2} & \cdots & A \tilde{v}_{m} \end{pmatrix} = \begin{pmatrix} \tilde{u}_{1}^{T} \\ \tilde{u}_{2}^{T} \\ \vdots \\ \tilde{u}_{n}^{T} \end{pmatrix} \begin{pmatrix} \sigma_{1} \tilde{u}_{1} & A \tilde{v}_{2} & \cdots & A \tilde{v}_{m} \end{pmatrix} = \begin{pmatrix} \sigma_{1} & z^{T} \\ 0 & \hat{A} \end{pmatrix}, $$

since $A\tilde{v}_{1} = \sigma_{1} u_{1} = \sigma_{1}\tilde{u}_{1}$ and the columns $\tilde{u}_{i}$ are orthonormal.
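The block structure just derived is easy to confirm numerically; here is my own sketch, reusing the Householder helper from the part (b) code above:

```python
import numpy as np

def reflector_with_first_column(u):
    # Householder reflector with Q e_1 = u (as in the part (b) sketch).
    w = u - np.eye(len(u))[:, 0]
    return np.eye(len(u)) - 2.0 * np.outer(w, w) / (w @ w)

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 5))

v1 = np.linalg.eigh(A.T @ A)[1][:, -1]  # direction of maximum magnification
sigma1 = np.linalg.norm(A @ v1)         # = ||A||_2
u1 = (A @ v1) / sigma1

Ut = reflector_with_first_column(u1)
Vt = reflector_with_first_column(v1)
At = Ut.T @ A @ Vt

print(np.allclose(At[0, 0], sigma1))    # (1,1) entry is sigma_1
print(np.allclose(At[1:, 0], 0.0))      # zeros below it in the first column
print(np.round(At[0, 1:], 12))          # and z is ~0, as the next step proves
```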

With $\hat{A} = \hat{U} \hat{\Sigma} \hat{V}^{T}$ (the induction hypothesis) we have

$$ \tilde{A} = \begin{pmatrix} \sigma_{1} & z^{T} \\ 0 & \hat{U}\hat{\Sigma}\hat{V}^{T} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & \hat{U} \end{pmatrix} \begin{pmatrix} \sigma_{1} & z^{T}\hat{V} \\ 0 & \hat{\Sigma} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \hat{V} \end{pmatrix}^{T} $$

The key remaining step is to show that $z$ is in fact zero. With $w = \begin{bmatrix} \sigma_{1} \\ z \end{bmatrix}$, note that $\|w\|_{2} = \left(\sigma_{1}^{2} + z^{T}z\right)^{\frac{1}{2}}$ and

$$ \left\| \begin{bmatrix} \sigma_{1} & z^{T} \\ 0 & \hat{A} \end{bmatrix} \begin{bmatrix} \sigma_{1} \\ z \end{bmatrix} \right\|_{2} = \left\| \begin{bmatrix} \sigma_{1}^{2} + z^{T}z \\ \hat{A}z \end{bmatrix} \right\|_{2} \geq \sigma_{1}^{2} + z^{T} z = \left( \sigma_{1}^{2} + z^{T}z\right)^{\frac{1}{2}} \left\| \begin{bmatrix} \sigma_{1} \\ z \end{bmatrix} \right\|_{2} $$

Now recall that the spectral norm is invariant under orthogonal transformations, so $\|\tilde{A}\|_{2} = \|\tilde{U}^{T} A \tilde{V}\|_{2} = \|A\|_{2} = \sigma_{1}$. On the other hand, the inequality above gives $\|\tilde{A}\|_{2} \geq \left( \sigma_{1}^{2} + z^{T}z \right)^{\frac{1}{2}}$. Hence $\sigma_{1}^{2} \geq \sigma_{1}^{2} + z^{T}z$, which forces $z^{T}z \leq 0$ and therefore $z = 0$.
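The norm invariance used in this step can itself be checked numerically; a short illustration of mine, with random orthogonal factors obtained from QR:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 5))
Q1 = np.linalg.qr(rng.standard_normal((4, 4)))[0]   # random orthogonal
Q2 = np.linalg.qr(rng.standard_normal((5, 5)))[0]   # random orthogonal

# The spectral norm is unchanged by orthogonal transformations.
print(np.isclose(np.linalg.norm(Q1.T @ A @ Q2, 2), np.linalg.norm(A, 2)))
```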

Once this is done, we have

$$ A = \tilde{U} \tilde{A} \tilde{V}^{T} = \tilde{U} \begin{pmatrix} 1 & 0 \\ 0 & \hat{U} \end{pmatrix} \begin{pmatrix} \sigma_{1} & 0 \\ 0 & \hat{\Sigma} \end{pmatrix} \left( \tilde{V} \begin{pmatrix} 1 & 0 \\ 0 & \hat{V} \end{pmatrix} \right)^{T} $$

The two outer factors are products of orthogonal matrices, hence orthogonal, and the middle factor is rectangular diagonal with entries $\sigma_{1} \geq \sigma_{2} \geq \cdots \geq \sigma_{r} > 0$, so this is an SVD of $A$. Since $\hat{A} \in \mathbb{R}^{(n-1) \times (m-1)}$ has rank $r-1$, the induction hypothesis supplied its SVD, and the induction is complete.
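Numerically, the final assembly looks like this; my own sketch, with numpy's SVD standing in for the induction hypothesis on $\hat{A}$:

```python
import numpy as np

def reflector_with_first_column(u):
    # Householder reflector with Q e_1 = u (as in the part (b) sketch).
    w = u - np.eye(len(u))[:, 0]
    return np.eye(len(u)) - 2.0 * np.outer(w, w) / (w @ w)

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 5))
n, m = A.shape

v1 = np.linalg.eigh(A.T @ A)[1][:, -1]
sigma1 = np.linalg.norm(A @ v1)
Ut = reflector_with_first_column((A @ v1) / sigma1)
Vt = reflector_with_first_column(v1)
At = Ut.T @ A @ Vt                        # [[sigma1, ~0], [0, Ahat]]

# Induction hypothesis: numpy's SVD stands in for the SVD of Ahat.
Uh, sh, Vht = np.linalg.svd(At[1:, 1:])

U = Ut @ np.block([[np.eye(1), np.zeros((1, n - 1))],
                   [np.zeros((n - 1, 1)), Uh]])
V = Vt @ np.block([[np.eye(1), np.zeros((1, m - 1))],
                   [np.zeros((m - 1, 1)), Vht.T]])
Sigma = np.zeros((n, m))
Sigma[0, 0] = sigma1
np.fill_diagonal(Sigma[1:, 1:], sh)

print(np.allclose(A, U @ Sigma @ V.T))    # True: the assembled SVD of A
```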
