Gradient of scalar field $a^T X^{-1} b$

derivatives, inverse, matrices, matrix-calculus, scalar-fields

While deriving GDA as a generative algorithm, I got stuck on how to take the gradient

$$\nabla_X \left( a^TX^{-1}b \right)$$

where $a, b$ are column vectors independent of $X$.

I have tried using the trace operator and the chain rule, but could not crack it. How should this derivative be approached?


The answer is

$$-X^{-T}ab^TX^{-T}$$

Best Answer

Let's use a colon to denote the trace/Frobenius product, i.e.

$$A:B = {\rm Tr}(A^TB)$$

Use the Frobenius product to write the function, then find its differential and gradient.

$$\eqalign{ \phi &= a^TX^{-1}b = a:X^{-1}b \cr &= ab^T:X^{-1} \cr d\phi &= ab^T:dX^{-1} = ab^T:(-X^{-1}\,dX\,X^{-1}) \cr &= -X^{-T}ab^TX^{-T}:dX \cr \frac{\partial \phi}{\partial X} &= -X^{-T}ab^TX^{-T} \cr }$$

NB: The cyclic property of the trace allows terms in a Frobenius product to be rearranged, e.g.

$$\eqalign{ A:BC &= B^TA:C = AC^T:B }$$

The differential of $X^{-1}$ is obtained from the differential of its defining property. Since $I$ is constant, $dI = 0$, and right-multiplying the second line by $X^{-1}$ gives the third.

$$\eqalign{ I &= X^{-1}X \cr dI &= dX^{-1}X+X^{-1}dX \cr 0 &= dX^{-1}+X^{-1}dX\,X^{-1} \cr dX^{-1} &= -X^{-1}dX\,X^{-1} \cr }$$
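As a sanity check, the closed form can be compared against a central finite-difference approximation. The following is a minimal NumPy sketch, not part of the original answer: the function names `phi`, `grad_phi` and `numerical_grad`, the step size `h`, and the random test matrix are all illustrative assumptions.

```python
import numpy as np

def phi(X, a, b):
    # phi(X) = a^T X^{-1} b, computed with a linear solve instead of an explicit inverse
    return a @ np.linalg.solve(X, b)

def grad_phi(X, a, b):
    # Closed form from the derivation: -X^{-T} a b^T X^{-T}.
    # Since b^T X^{-T} = (X^{-1} b)^T, this is minus the outer product of
    # u = X^{-T} a and v = X^{-1} b.
    u = np.linalg.solve(X.T, a)
    v = np.linalg.solve(X, b)
    return -np.outer(u, v)

def numerical_grad(X, a, b, h=1e-6):
    # Central differences: entry (i, j) approximates d(phi)/d(X_ij).
    # The step size h is an assumed, illustrative value.
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (phi(X + E, a, b) - phi(X - E, a, b)) / (2 * h)
    return G

# Hypothetical test data: a random matrix shifted towards n*I so it is well conditioned.
rng = np.random.default_rng(0)
n = 4
X = n * np.eye(n) + 0.5 * rng.standard_normal((n, n))
a = rng.standard_normal(n)
b = rng.standard_normal(n)

max_abs_diff = np.max(np.abs(grad_phi(X, a, b) - numerical_grad(X, a, b)))
```

For a well-conditioned $X$, `max_abs_diff` should be small, limited by the finite-difference step and floating-point round-off. Entry $(i,j)$ of the returned matrix corresponds to $\partial\phi/\partial X_{ij}$, matching the convention $d\phi = G : dX$ used above.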
