Why is the squared Frobenius norm (energy) of a rectangular matrix $A$ equal to the trace of $A^T A$ or $AA^T$?

linear-algebra matrices normed-spaces trace

Reading through a book, it is mentioned that $\|A\|^2_F=\text{Energy}(A)=\text{tr}(AA^T)=\text{tr}(A^TA)$. I understand that for square matrices the squared Frobenius norm is the sum of the squares of all elements of the matrix, but I cannot intuitively see why for rectangular matrices it would be the trace of the matrix multiplied by its transpose (or the other way around). For instance, it would be that $\text{tr}(CD^T) = \text{tr}(DC^T) = \displaystyle\sum_{i=1}^n\sum_{j=1}^d c_{ij}d_{ij}$ for some matrices $C, D$ of size $n \times d$. Maybe some sort of proof would help?

SOURCE: Linear Algebra and Optimization for Machine Learning: A Textbook (page 20)

Best Answer

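Note that $$ \|A\|_F^2 = \sum_{j=1}^n\sum_{k=1}^d a_{jk} a_{jk} = \mathrm{tr}(A A^T), $$ using your identity for $\mathrm{tr}(CD^T)$ with $C = A$ and $D = A$. Nothing here requires $A$ to be square: $AA^T$ is $n \times n$, so its trace is defined for any $n \times d$ matrix $A$. Since the trace is cyclic, $\mathrm{tr}(AA^T) = \mathrm{tr}(A^TA)$, which gives the second equality.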
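If the cyclic property feels like a black box, both traces can also be computed entry by entry. The following is a sketch of that computation, assuming $A$ is $n \times d$ with entries $a_{jk}$ (the same indexing as in the sum for $\|A\|_F^2$). The $j$-th diagonal entry of the $n \times n$ matrix $AA^T$ is the dot product of row $j$ of $A$ with itself, and the $k$-th diagonal entry of the $d \times d$ matrix $A^TA$ is the dot product of column $k$ of $A$ with itself:

$$ (AA^T)_{jj} = \sum_{k=1}^d a_{jk}\,a_{jk}, \qquad (A^TA)_{kk} = \sum_{j=1}^n a_{jk}\,a_{jk}. $$

Summing the diagonal entries in each case gives

$$ \mathrm{tr}(AA^T) = \sum_{j=1}^n \sum_{k=1}^d a_{jk}^2, \qquad \mathrm{tr}(A^TA) = \sum_{k=1}^d \sum_{j=1}^n a_{jk}^2. $$

These are the same finite sum taken in a different order (by rows first or by columns first), and each one adds up the square of every entry of $A$ exactly once, which is $\|A\|_F^2$.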