Derivative of matrix nuclear norm

derivatives, linear algebra, matrix-calculus, nuclear norm

I'm trying to find the derivative of

$$\left\|L^TL - \sigma\right\|_1 = \mbox{Tr} \left( \sqrt{(L^TL - \sigma)^\dagger(L^TL - \sigma)} \right)$$

with respect to $L$, where $\dagger$ is the conjugate transpose and $\sigma$ is some matrix.

I tried doing this with differentials and ended up at
$$\begin{align}
&d\,\text{Tr}\left(\sqrt{(L^TL - \sigma)^\dagger(L^TL - \sigma)}\right) \\
&= \left(\frac{1}{2\sqrt{(L^TL - \sigma)^\dagger(L^TL - \sigma)}}\right)^T:\left(dX^\dagger (X - \sigma) + (X - \sigma)^\dagger dX\right)
\end{align}$$

where $X = L^TL$. This doesn't look too promising as I eventually only want $dL$ terms. Could someone point out how to proceed? Thank you.

Best Answer

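Define (writing $\Sigma$ for the matrix called $\sigma$ in the question)
$$\eqalign{ M &= L^TL-\Sigma \cr S &= \big(M^TM\big)^{1/2} \cr \phi &= \|M\|_* = {\rm Tr}(S) \cr }$$
Then, assuming all the matrices are real and $M$ is nonsingular (so that $S^{-1}$ exists),
$$\eqalign{ \frac{\partial\phi}{\partial L} &= LMS^{-1}+LS^{-1}M^T \cr }$$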

The detailed calculation follows, where a colon denotes the trace/Frobenius product, i.e. $A:B = {\rm Tr}(A^TB)$.
$$\eqalign{ d\phi &= M(M^TM)^{-1/2}:dM \cr &= MS^{-1}:(L^TdL + dL^TL) \cr &= \big(MS^{-1}+S^{-1}M^T\big):L^TdL \cr &= L\big(MS^{-1}+S^{-1}M^T\big):dL \cr \frac{\partial\phi}{\partial L} &= L\big(MS^{-1}+S^{-1}M^T\big) \cr }$$
The third line uses $A:B^T = A^T:B$ together with the symmetry of $S$, and the fourth uses $A:L^TdL = LA:dL$.
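If you want to sanity-check the real-valued result numerically, something like the following sketch should do. It compares $L\big(MS^{-1}+S^{-1}M^T\big)$ against a central finite-difference gradient of the nuclear norm. The helper names (`sqrtm_psd`, `phi`, `grad_phi`, `numerical_grad`), the matrix sizes, and the random test data are my own choices for illustration, and it assumes $M$ is nonsingular so that $S$ can be inverted.

```python
import numpy as np

def sqrtm_psd(A):
    """Square root of a symmetric positive semidefinite matrix via eigh."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)          # guard against tiny negative round-off
    return (V * np.sqrt(w)) @ V.T

def phi(L, Sigma):
    """Nuclear norm of M = L^T L - Sigma."""
    M = L.T @ L - Sigma
    return np.linalg.norm(M, ord="nuc")

def grad_phi(L, Sigma):
    """Closed-form gradient L (M S^{-1} + S^{-1} M^T), assuming M is nonsingular."""
    M = L.T @ L - Sigma
    S = sqrtm_psd(M.T @ M)
    S_inv = np.linalg.inv(S)
    return L @ (M @ S_inv + S_inv @ M.T)

def numerical_grad(f, L, eps=1e-6):
    """Entrywise central finite differences of a scalar function f at L."""
    G = np.zeros_like(L)
    for i in range(L.shape[0]):
        for j in range(L.shape[1]):
            E = np.zeros_like(L)
            E[i, j] = eps
            G[i, j] = (f(L + E) - f(L - E)) / (2.0 * eps)
    return G

# Hypothetical test data: L is 4x3, Sigma is a generic (non-symmetric) 3x3 matrix.
rng = np.random.default_rng(0)
L = rng.standard_normal((4, 3))
Sigma = rng.standard_normal((3, 3))

analytic = grad_phi(L, Sigma)
numeric = numerical_grad(lambda X: phi(X, Sigma), L)
max_abs_diff = np.max(np.abs(analytic - numeric))
print("max abs difference:", max_abs_diff)
```

The two gradients should agree up to finite-difference error as long as no singular value of $M$ is close to zero. Near a singular $M$ the nuclear norm is not differentiable and the formula breaks down.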


Update

If all the matrices are complex, and Wirtinger derivatives are acceptable to you, then $$\eqalign{ M &= L^\dagger L-\Sigma,\quad S = \big(M^\dagger M\big)^{1/2},\quad \phi = {\rm Tr}(S) \cr \frac{\partial\phi}{\partial L} &= \tfrac{1}{2}L^*M^*S^{-1} \cr }$$ If $L$ is real (i.e. $L=L^*,\, L^\dagger=L^T$), and all the others are complex then $$\eqalign{ \frac{\partial\phi}{\partial L} &= \tfrac{1}{2}L\big(M^*S^{-1}+(S^*)^{-1}M^\dagger\big) \cr }$$
