Matrix derivative of the Frobenius norm of a product containing inverse

derivatives, linear algebra, matrix-calculus, trace

Let $A\in\mathbb{R}^{n\times d}$, $X\in\mathbb{R}^{d\times d}$, $d>n$. Let $A$ have rank $n$ and let $X$ be invertible. What is the derivative of $$\Vert XA^T(AXA^T)^{-1} - A^T(AA^T)^{-1}\Vert_F^2$$ with respect to $X$? Here, $\Vert A \Vert_F^2 = \operatorname{Tr}(A^TA)$.

A related question that would help with the above problem: is it possible to calculate the derivative of $$\operatorname{Tr}(U(X)V(X))$$ with respect to $X$ in terms of the derivatives of $\operatorname{Tr}(U(X))$ and $\operatorname{Tr}(V(X))$ with respect to $X$? Here $U$ and $V$ are matrix functions of $X$.

I found the "Scalar-by-matrix" section of https://en.wikipedia.org/wiki/Matrix_calculus useful in similar problems.

Best Answer

Let $\mathbf{C}= \mathbf{X} \mathbf{A}^T (\mathbf{A}\mathbf{X}\mathbf{A}^T)^{-1} - \mathbf{A}^T (\mathbf{A}\mathbf{A}^T)^{-1}$ and $\mathbf{D} = \mathbf{A}\mathbf{X}\mathbf{A}^T$.

With this notation we can write $\phi = \| \mathbf{C} \|_F^2 = \mathbf{C}:\mathbf{C}$, where the colon denotes the Frobenius inner product, $\mathbf{C}:\mathbf{C} = \operatorname{Tr}(\mathbf{C}^T\mathbf{C})$.
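For reference, the manipulations below rely on a few standard identities for the Frobenius product $\mathbf{P}:\mathbf{Q} = \operatorname{Tr}(\mathbf{P}^T\mathbf{Q})$. The symbols $\mathbf{P}$, $\mathbf{Q}$, $\mathbf{R}$ are placeholders introduced here for illustration (any matrices of compatible sizes), not part of the original problem:

$$
\begin{aligned}
\mathbf{P}:\mathbf{Q}\mathbf{R} &= \mathbf{Q}^T\mathbf{P}:\mathbf{R} = \mathbf{P}\mathbf{R}^T:\mathbf{Q}, \\
d(\mathbf{P}:\mathbf{P}) &= 2\,\mathbf{P}:d\mathbf{P}, \\
d(\mathbf{D}^{-1}) &= -\mathbf{D}^{-1}(d\mathbf{D})\,\mathbf{D}^{-1}.
\end{aligned}
$$

The first line follows from the cyclic property of the trace; the last one from differentiating $\mathbf{D}\mathbf{D}^{-1} = \mathbf{I}$.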

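The term $\mathbf{A}^T (\mathbf{A}\mathbf{A}^T)^{-1}$ does not depend on $\mathbf{X}$, and $d\mathbf{D} = \mathbf{A}(d\mathbf{X})\mathbf{A}^T$, $d(\mathbf{D}^{-1}) = -\mathbf{D}^{-1}(d\mathbf{D})\mathbf{D}^{-1}$. It follows that
\begin{eqnarray}
d\phi &=& 2 \mathbf{C}:d\mathbf{C} \\
&=& 2 \mathbf{C}:(d\mathbf{X}) \mathbf{A}^T \mathbf{D}^{-1} - 2 \mathbf{C}:\mathbf{X} \mathbf{A}^T \mathbf{D}^{-1}(d\mathbf{D})\mathbf{D}^{-1}\\
&=& 2 \mathbf{C}\mathbf{D}^{-T} \mathbf{A}:d\mathbf{X} - 2 \mathbf{D}^{-T}\mathbf{A}\mathbf{X}^T\mathbf{C} \mathbf{D}^{-T}: \mathbf{A}(d\mathbf{X})\mathbf{A}^T \\
&=& 2 \left(\mathbf{C}\mathbf{D}^{-T} \mathbf{A} - \mathbf{A}^T\mathbf{D}^{-T}\mathbf{A}\mathbf{X}^T\mathbf{C} \mathbf{D}^{-T}\mathbf{A}\right):d\mathbf{X}
\end{eqnarray}
Reading off the coefficient of $d\mathbf{X}$, the gradient simplifies to $$ \frac{\partial \phi}{\partial \mathbf{X}} = 2 (\mathbf{I} - \mathbf{A}^T \mathbf{D}^{-T}\mathbf{A}\mathbf{X}^T)\mathbf{C} \mathbf{D}^{-T} \mathbf{A} $$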
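A quick way to gain confidence in a closed-form gradient like this is to compare it against central finite differences. The following is a minimal NumPy sketch, not part of the original answer: the function names (`phi`, `grad_phi`, `numerical_grad`), the sizes $n=3$, $d=5$, and the particular test matrices are all assumptions chosen for illustration, and $\mathbf{X}$ is treated as an unconstrained (not necessarily symmetric) matrix.

```python
import numpy as np

def phi(X, A):
    """Objective ||X A^T (A X A^T)^{-1} - A^T (A A^T)^{-1}||_F^2."""
    D = A @ X @ A.T
    C = X @ A.T @ np.linalg.inv(D) - A.T @ np.linalg.inv(A @ A.T)
    return np.sum(C ** 2)

def grad_phi(X, A):
    """Closed-form gradient 2 (I - A^T D^{-T} A X^T) C D^{-T} A."""
    D_inv = np.linalg.inv(A @ X @ A.T)
    C = X @ A.T @ D_inv - A.T @ np.linalg.inv(A @ A.T)
    I = np.eye(X.shape[0])
    return 2.0 * (I - A.T @ D_inv.T @ A @ X.T) @ C @ D_inv.T @ A

def numerical_grad(X, A, eps=1e-6):
    """Entry-wise central finite differences of phi with respect to X."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (phi(X + E, A) - phi(X - E, A)) / (2.0 * eps)
    return G

# Illustrative test data (assumed, not from the question):
# A = [I_n | small random block] has full row rank n by construction,
# and X is a small perturbation of the identity, so A X A^T stays invertible.
rng = np.random.default_rng(0)
n, d = 3, 5
A = np.hstack([np.eye(n), 0.5 * rng.standard_normal((n, d - n))])
X = np.eye(d) + 0.1 * rng.standard_normal((d, d))

G_formula = grad_phi(X, A)
G_numeric = numerical_grad(X, A)
rel_err = np.linalg.norm(G_formula - G_numeric) / np.linalg.norm(G_formula)
print("relative error between formula and finite differences:", rel_err)
```

A small relative error indicates that the formula and the numerical derivative agree at this test point. It is a sanity check at one point rather than a proof, and the step size `eps` may need adjusting for badly conditioned $\mathbf{A}\mathbf{X}\mathbf{A}^T$.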