Gradient of $M \mapsto \mbox{tr} \left( A^{-1} M \right) + \mbox{tr} \left( M^{-1} B \right)$

derivatives, matrices, matrix-calculus, scalar-fields, trace

Given positive definite matrices $A$ and $B$, let $$f(M) := \mbox{tr} \left( A^{-1} M \right) + \mbox{tr} \left( M^{-1} B \right)$$ What is $\nabla f(N)$?


According to my source, $\nabla f(N)=A-N^{-1}BN^{-1}$. However, I would expect a vector. How can I compute the trace gradient and what is the shape of its value?

Best Answer

Since $f$ eats a matrix and spits out a number, its derivative $\nabla f(N)$ at $N$ eats a matrix direction and spits out a number. How you organise that information is up to you.

For $N$ invertible and $H$ small enough, $N+H$ is also invertible, and expanding the inverse to first order gives
\begin{align*}
(N+H)^{-1}&=[N(I+N^{-1}H)]^{-1}\\
&=(I+N^{-1}H)^{-1}N^{-1}\\
&=(I-N^{-1}H+o(\lVert H\rVert))N^{-1}\\
&=N^{-1}-N^{-1}HN^{-1}+o(\lVert H\rVert)
\end{align*}
So, using linearity and the cyclic property of the trace,
\begin{align*}
f(N+H)&=\operatorname{tr}(A^{-1}(N+H))+\operatorname{tr}((N+H)^{-1}B)\\
&=f(N)+\operatorname{tr}(A^{-1}H)-\operatorname{tr}(N^{-1}HN^{-1}B)+o(\lVert H\rVert)\\
&=f(N)+\operatorname{tr}(A^{-1}H)-\operatorname{tr}(N^{-1}BN^{-1}H)+o(\lVert H\rVert)\\
&=f(N)+\langle (A^{-1}-N^{-1}BN^{-1})^T,H\rangle_{F}+o(\lVert H\rVert)
\end{align*}
which gives
$$ \nabla f(N)(H)=\langle (A^{-1}-N^{-1}BN^{-1})^T,H\rangle_{F} $$
where $\langle-,-\rangle_F$ is the Frobenius inner product, $\langle X,Y\rangle_F=\operatorname{tr}(X^TY)$. If you organise the gradient as a matrix, it is $(A^{-1}-N^{-1}BN^{-1})^T$, which has the same shape as $N$; when $A$, $B$ and $N$ are symmetric, the transpose can be dropped.
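As a sanity check on the derivation, here is a minimal NumPy sketch that compares the formula against a central finite difference in a random direction $H$. The helper names (`f`, `grad_f`, `random_spd`), the dimension, the seed and the step size are all illustrative choices, not part of the original problem. $N$ is deliberately made non-symmetric so that the transpose in the formula is actually exercised.

```python
import numpy as np

def f(M, A, B):
    """f(M) = tr(A^{-1} M) + tr(M^{-1} B), using solves instead of explicit inverses."""
    return np.trace(np.linalg.solve(A, M)) + np.trace(np.linalg.solve(M, B))

def grad_f(N, A, B):
    """Matrix G such that the directional derivative at N along H is <G, H>_F."""
    N_inv = np.linalg.inv(N)
    return (np.linalg.inv(A) - N_inv @ B @ N_inv).T

def random_spd(n, rng):
    """Hypothetical helper: a random symmetric positive definite n x n matrix."""
    X = rng.standard_normal((n, n))
    return X @ X.T + n * np.eye(n)

rng = np.random.default_rng(0)  # arbitrary seed
n = 4                           # arbitrary dimension
A = random_spd(n, rng)
B = random_spd(n, rng)
# Small non-symmetric perturbation of an SPD matrix: still invertible here.
N = random_spd(n, rng) + 0.3 * rng.standard_normal((n, n))
H = rng.standard_normal((n, n))  # arbitrary direction

G = grad_f(N, A, B)
analytic = np.sum(G * H)  # Frobenius inner product <G, H>_F = tr(G^T H)

t = 1e-6  # finite-difference step (assumed small enough for this example)
numeric = (f(N + t * H, A, B) - f(N - t * H, A, B)) / (2 * t)

print("gradient shape:", G.shape)
print("analytic:", analytic, "numeric:", numeric)
```

The two printed values should agree to several digits, and `G.shape` is `(n, n)`. Flattening `G` with `G.ravel()` gives the length-$n^2$ vector form if a vector is what you need.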
