[Math] Derivative of a trace

derivatives, linear algebra, matrices, trace

I'm new here, so "Hi" to everyone 😀

I have the following problem.
I have the matrices $A$, $B$, $C$, $X$ and $Y$. All matrices are square (say $n \times n$).
In particular:
– $A$ is full rank;
– $B$ is symmetric and positive (semi)definite;
– $C$ is diagonal and positive definite;
– $Y$ is diagonal and positive definite;
– $X$ is diagonal ($X = \operatorname{diag}\{x_1, \ldots,x_n\}$) and is the unknown matrix.

Then I have the following function:
$f(X) = (A(B+X^{T}YX)^{-1}A^{T} + C)^{-1}$
(it may seem dumb to write $X^{T}$ since it is diagonal, but I think this is the best way to write it).

I would like to evaluate the derivative of the trace of $f(X)$ with respect to each $x_i$.

Any ideas?

Best Answer

If we perturb an invertible matrix $M$ by a small $\Delta M$, the first-order change in $M^{-1}$ is given by
$$ \Delta (M^{-1}) := (M+\Delta M)^{-1} - M^{-1} = -M^{-1} (\Delta M) M^{-1} + O(\|\Delta M\|^2). $$

Now, consider $f(X) = (A(B+X^{T}YX)^{-1}A^{T} + C)^{-1}$. Applying the rule above twice, first to the outer inverse and then to the inner one (the two minus signs cancel), gives
\begin{align}
\Delta f(X) &= \Delta\left((A(B+X^{T}YX)^{-1}A^{T} + C)^{-1}\right)\\
&\approx -f(X)\ \Delta\left(A(B+X^{T}YX)^{-1}A^{T} + C\right) f(X)\\
&= -f(X)A\, \Delta\left((B+X^{T}YX)^{-1}\right) A^{T}f(X)\\
&\approx f(X)A(B+X^{T}YX)^{-1}\, \Delta\left(B+X^{T}YX\right) (B+X^{T}YX)^{-1}A^{T}f(X)\\
&\approx f(X)A(B+X^{T}YX)^{-1} \left((\Delta X)^{T}YX + X^{T}Y\Delta X\right) (B+X^{T}YX)^{-1}A^{T}f(X).
\end{align}

Therefore
\begin{align}
\Delta\, \mathrm{trace}f(X) &\approx \mathrm{trace}\, f(X)A(B+X^{T}YX)^{-1} \left((\Delta X)^{T}YX + X^{T}Y\Delta X\right) (B+X^{T}YX)^{-1}A^{T}f(X)\\
&= 2\,\mathrm{trace}\, (\Delta X)^{T}YX (B+X^{T}YX)^{-1}A^{T}f(X)^2A(B+X^{T}YX)^{-1}.
\end{align}
The last equality uses the cyclic property of the trace, the identity $\mathrm{trace}\,N = \mathrm{trace}\,N^{T}$, and the symmetry of $Y$, $B+X^{T}YX$ and $f(X)$, which together make the two terms in the bracket contribute equally.

In turn,
$$ \frac{d\,\mathrm{trace}f(X)}{dX} = 2YX (B+X^{T}YX)^{-1}A^{T}f(X)^2A(B+X^{T}YX)^{-1}. $$

This is the formula for a general square matrix $X$. For a diagonal $X$, simply take the diagonal of the above derivative: $\partial\,\mathrm{trace}f(X)/\partial x_i$ is its $(i,i)$ entry.
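The closed-form derivative above can be sanity-checked numerically. The following is a minimal NumPy sketch, not part of the original answer: it compares the diagonal of the formula against central finite differences of $\mathrm{trace}f(X)$ on a randomly generated instance. The function names (`trace_f`, `grad_trace_f`, `finite_difference`, `make_example`), the step size `h` and the way the test matrices are drawn are all illustrative assumptions.

```python
import numpy as np


def trace_f(x, A, B, C, Y):
    """trace of f(X) = (A (B + X^T Y X)^{-1} A^T + C)^{-1} with X = diag(x)."""
    X = np.diag(x)
    M = B + X.T @ Y @ X
    return np.trace(np.linalg.inv(A @ np.linalg.inv(M) @ A.T + C))


def grad_trace_f(x, A, B, C, Y):
    """Diagonal of 2 Y X M^{-1} A^T f^2 A M^{-1}, i.e. d trace f / d x_i."""
    X = np.diag(x)
    M_inv = np.linalg.inv(B + X.T @ Y @ X)
    f = np.linalg.inv(A @ M_inv @ A.T + C)
    full = 2.0 * Y @ X @ M_inv @ A.T @ f @ f @ A @ M_inv
    return np.diag(full).copy()


def finite_difference(x, A, B, C, Y, h=1e-6):
    """Central-difference approximation of d trace f / d x_i (h is an assumed step)."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (trace_f(x + e, A, B, C, Y) - trace_f(x - e, A, B, C, Y)) / (2 * h)
    return g


def make_example(n=4, seed=0):
    """Hypothetical test data matching the assumptions in the question."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n)) + n * np.eye(n)  # full rank almost surely
    R = rng.standard_normal((n, n))
    B = R @ R.T + np.eye(n)                          # symmetric positive definite
    C = np.diag(rng.uniform(0.5, 2.0, n))            # diagonal positive definite
    Y = np.diag(rng.uniform(0.5, 2.0, n))            # diagonal positive definite
    x = rng.uniform(0.5, 2.0, n)                     # diagonal entries of X
    return A, B, C, Y, x


if __name__ == "__main__":
    A, B, C, Y, x = make_example()
    analytic = grad_trace_f(x, A, B, C, Y)
    numeric = finite_difference(x, A, B, C, Y)
    print("analytic:", analytic)
    print("numeric: ", numeric)
    print("max abs difference:", np.max(np.abs(analytic - numeric)))
```

If the formula is right, the two gradients should agree up to the finite-difference error. Explicit inverses are used here only to mirror the formula; for large $n$ or ill-conditioned matrices, linear solves would be the more careful choice.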
