Differentiability of the Schatten $p$-norm on positive definite matrices

derivatives, linear algebra, positive-semidefinite, reference-request

Let $V$ be the vector space of symmetric matrices in $\Bbb R^{n\times n}$. For $p\in (1,\infty)$, the Schatten $p$-norm of $M\in V$ is defined as $\|M\|_p =(\sum_{i=1}^n \sigma_i(M)^p)^{1/p}$ where $\sigma_1(M),\ldots,\sigma_n(M)$ are the singular values of $M$. Now, let $C\subset V$ be the cone of positive semi-definite matrices. It follows from this old post that
$$\nabla \|M\|_p = \|M\|_p^{1-p}M^{p-1}\qquad \forall M\in C.$$

Is there a reference for the above statement?

Best Answer

Let $M\in C$ with $M\neq 0$ (the norm is not differentiable at the origin). Then $M$ is positive semi-definite, so $|M|=M$, the singular values of $M$ coincide with its eigenvalues, and $$\|M\|_p = \big(\operatorname{tr}M^p\big)^\frac 1p.$$ A scalar-by-matrix derivative using the chain rule gives $$\begin{aligned}\frac{\partial}{\partial M}\,\|M\|_p &= \left.\frac{d}{dx}x^\frac 1p\right|_{x\,=\,\operatorname{tr}M^p} \cdot\frac{\partial}{\partial M}\,\operatorname{tr}M^p \\[3ex] &= \frac 1p\big(\operatorname{tr}M^p\big)^{\frac 1p-1}\cdot pM^{p-1} \\[3ex] &= \|M\|_p^{1-p}M^{p-1},\end{aligned}$$ where the last step uses $\big(\operatorname{tr}M^p\big)^{\frac 1p-1}=\|M\|_p^{1-p}$. This is a straightforward application of matrix calculus, which might be the reason why a quick web search turns up no hit for this derivation.
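The formula can be sanity-checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes NumPy, a randomly generated positive definite $M$, and the trace inner product $\langle A,B\rangle=\operatorname{tr}(AB)$ on $V$. The helper names `schatten_norm`, `schatten_grad_psd` and `directional_fd` are made up for this illustration. It compares $\operatorname{tr}(\nabla\|M\|_p\,H)$ with a central finite difference of $t\mapsto\|M+tH\|_p$ for a symmetric direction $H$.

```python
import numpy as np

def schatten_norm(M, p):
    """Schatten p-norm of a symmetric matrix, computed from its eigenvalues."""
    eigvals = np.linalg.eigvalsh(M)
    return np.sum(np.abs(eigvals) ** p) ** (1.0 / p)

def schatten_grad_psd(M, p):
    """Closed-form gradient ||M||_p^(1-p) * M^(p-1) for a nonzero PSD matrix M."""
    w, Q = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None)            # remove tiny negative round-off
    M_pow = (Q * w ** (p - 1)) @ Q.T     # M^(p-1) via the spectral decomposition
    return schatten_norm(M, p) ** (1 - p) * M_pow

def directional_fd(M, H, p, eps=1e-6):
    """Central finite difference of t -> ||M + t H||_p at t = 0."""
    return (schatten_norm(M + eps * H, p) - schatten_norm(M - eps * H, p)) / (2 * eps)

rng = np.random.default_rng(0)
n, p = 4, 2.5                            # illustrative size and exponent (assumptions)
A = rng.standard_normal((n, n))
M = A @ A.T + 0.1 * np.eye(n)            # symmetric positive definite test matrix
B = rng.standard_normal((n, n))
H = (B + B.T) / 2                        # symmetric perturbation direction

analytic = np.sum(schatten_grad_psd(M, p) * H)   # tr(grad @ H) for symmetric matrices
numeric = directional_fd(M, H, p)
print(analytic, numeric)
```

The two printed numbers should agree up to finite-difference error. For $p=2$ the formula reduces to $M/\|M\|_F$, which gives a second quick check.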

A general and helpful reference is this section.
For the derivative of $\,\operatorname{tr}(M^p)\,$ in the second term you may also look at page 155 here.
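For convenience, here is a sketch of why $\frac{\partial}{\partial M}\operatorname{tr}M^p = pM^{p-1}$. It is an illustration rather than a substitute for the references, and the symmetric direction $H\in V$ and the trace inner product $\langle A,B\rangle=\operatorname{tr}(AB)$ are notation introduced here. For integer $p$, the product rule and the cyclicity of the trace give
$$\left.\frac{d}{dt}\operatorname{tr}(M+tH)^p\right|_{t=0} = \sum_{k=0}^{p-1}\operatorname{tr}\big(M^{k}\,H\,M^{p-1-k}\big) = p\,\operatorname{tr}\big(M^{p-1}H\big).$$
For non-integer $p>1$ the same identity is an instance of the standard rule for trace functions, $\left.\frac{d}{dt}\operatorname{tr}f(M+tH)\right|_{t=0}=\operatorname{tr}\big(f'(M)H\big)$ for continuously differentiable $f$, applied to $f(x)=|x|^p$, which agrees with $x^p$ on the spectrum of a positive semi-definite $M$. Since $\operatorname{tr}\big(pM^{p-1}H\big)=\langle pM^{p-1},H\rangle$ for every $H\in V$, the gradient is $pM^{p-1}$.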