The following excerpt is from chapter 2.8 of Deep Learning, by Goodfellow, Bengio, and Courville:
The full section (for context) is as follows:
I would greatly appreciate it if people could please take the time to give a full explanation of the excerpt (not the entire section). The following lists what I don't understand from the excerpt:
1. What is meant by "eigendecomposition of functions of $\mathbf{A}$"? I understand what an eigendecomposition is, but I don't understand what is meant by a function of a matrix.
2. Related to the above, why can we interpret the singular value decomposition of $\mathbf{A}$ in terms of the eigendecomposition of functions of $\mathbf{A}$?
3. Why are the left-singular vectors of $\mathbf{A}$ the eigenvectors of $\mathbf{A}\mathbf{A}^T$?
4. Why are the right-singular vectors of $\mathbf{A}$ the eigenvectors of $\mathbf{A}^T\mathbf{A}$?
5. Why are the nonzero singular values of $\mathbf{A}$ the square roots of the eigenvalues of $\mathbf{A}^T\mathbf{A}$?
6. Why is the same true for $\mathbf{A}\mathbf{A}^T$?
Best Answer
1. Just as $f(x) = x^2$ is a function of a scalar $x$, the maps $A \mapsto AA^T$ and $A \mapsto A^TA$ are matrix-valued functions of $A$: both products are determined entirely by $A$.
2. See below.
3. and 6. If $A = UDV^T$ is a singular value decomposition of $A$, with $U$ and $V$ orthogonal (so $U^{-1} = U^T$ and $V^{-1} = V^T$) and $D$ a (possibly rectangular) diagonal matrix of singular values $\lambda_i$, then $$AA^T = UDV^T(UDV^T)^T = UDV^TVD^TU^T = UDD^TU^T = U\,\text{diag}(\mathbf{\lambda}^2)\,U^{-1},$$ using $V^TV = I$ and the fact that $DD^T$ is a square diagonal matrix whose entries are the squared singular values (padded with zeros if $A$ is not square). This is an eigendecomposition of $AA^T$: the columns of $U$ (the left-singular vectors of $A$) are the eigenvectors, and the squares of the singular values of $A$ are the eigenvalues of $AA^T$.
4. and 5. This is similar: $$A^TA = (UDV^T)^TUDV^T = VD^TU^TUDV^T = VD^TDV^T = V\,\text{diag}(\mathbf{\lambda}^2)\,V^{-1},$$ so the columns of $V$ (the right-singular vectors of $A$) are the eigenvectors of $A^TA$, and its nonzero eigenvalues are again the squared singular values of $A$, so the nonzero singular values are their square roots.
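As a sanity check, the identities above can be verified numerically. The sketch below (my own addition, using NumPy with an arbitrary random matrix) compares the singular values of $A$ with the eigenvalues of $A^TA$ and $AA^T$, and checks that the columns of $U$ are eigenvectors of $AA^T$:

```python
import numpy as np

# Arbitrary rectangular matrix (4x3) for illustration.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

# SVD: A = U @ diag(s) @ Vt, with U, Vt orthogonal, s in descending order.
U, s, Vt = np.linalg.svd(A)

# Eigenvalues of the symmetric matrix A^T A, sorted descending.
eigvals_AtA = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]

# Nonzero singular values of A = square roots of eigenvalues of A^T A.
assert np.allclose(s, np.sqrt(eigvals_AtA))

# Eigenvalues of A A^T: the same nonzero values, padded with a zero
# (A A^T is 4x4 but A has rank at most 3).
eigvals_AAt = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]
assert np.allclose(eigvals_AAt[:3], s**2)
assert np.allclose(eigvals_AAt[3], 0)

# Columns of U are eigenvectors of A A^T: (A A^T) u_i = s_i^2 u_i.
for i in range(3):
    assert np.allclose(A @ A.T @ U[:, i], s[i] ** 2 * U[:, i])
```

The checks on $V$ are analogous: each column $v_i$ of $V$ satisfies $(A^TA)v_i = \lambda_i^2 v_i$.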