[Math] Matrix derivative formula using the matrix chain rule

matrix-calculus

Let $X \in \mathbb{C}^{m \times n}$ be a matrix. Let $F(X) \in \mathbb{C}^{m \times m}$ be a matrix-valued function of $X$, e.g. $F(X) = I_m + X X^{\dagger}$, where $^\dagger$ denotes the conjugate transpose and $I_m$ is the identity matrix of dimension $m$. Finally, let $\mathbf{g}(X)$ be a (column-)vector-valued function of $X$, e.g. $\mathbf{g}(X) = u - Xv$, with $u, v$ column vectors of appropriate dimensions. Then,
$$
Q(X) = \mathbf{g}(X)^\dagger F(X) \mathbf{g}(X)
$$
is clearly a scalar. What I want to find is a formula for
$$
\frac{\partial \mathbf{g}(X)^\dagger F(X) \mathbf{g}(X)}{\partial X} = \ ?
$$


Edit: By Leibniz's rule,
$$
\frac{\partial Q(X)}{\partial X} = \frac{\partial \mathbf{g}^{\dagger}(X)}{\partial X} F(X) \mathbf{g}(X) + \mathbf{g}^{\dagger}(X) \frac{\partial F(X)}{\partial X} \mathbf{g}(X) + \mathbf{g}^{\dagger}(X) F(X) \frac{\partial \mathbf{g}(X)}{\partial X}
$$

Best Answer

To begin with, as discussed in the comments, one should understand what $\partial X$ in the denominator means. The space of matrices is a vector space, so all the maps in question are multi-variable maps. Hence, every map from the space of matrices to another space has a differential, which can be thought of as a collection of partial derivatives. In other words, describing the differential of such a map is equivalent to specifying all of its partial derivatives.

So, let $e_i$ be a basis of the space of matrices (regarded as a real vector space, since $\dagger$ involves complex conjugation), and let $\frac{\partial}{\partial x^i}$ denote the directional derivative in the $e_i$ direction. By the Leibniz rule,
$$
\frac{\partial}{\partial x^i}(f_1 \cdots f_k)=\frac{\partial f_1}{\partial x^i}f_2\cdots f_k+\ldots+f_1\cdots f_{k-1}\frac{\partial f_k}{\partial x^i}.
$$
Note that if the $f$'s are matrix-valued (and they are in your example), then you can't change the order of the factors in the above equation, as $AB\neq BA$ for general matrices. Taking the transpose and/or the complex conjugate commutes with differentiating along a real direction, and so transpose and $\dagger$ simply carry through.
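
As a sanity check, the rule can be tested numerically on the example from the question. This is only a sketch under my own assumptions. I take $F(X) = I_m + XX^\dagger$ and $\mathbf{g}(X) = u - Xv$, and differentiate along an arbitrary direction $E$ with a real parameter $t$, i.e. $\frac{d}{dt}Q(X+tE)\big|_{t=0}$. The factor derivatives are then $-Ev$ for $\mathbf{g}$ and $EX^\dagger + XE^\dagger$ for $F$. The helper names `crandn` and `dQ`, the dimensions, and the random test data are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4  # illustrative dimensions (assumption)

def crandn(*shape):
    """Random complex array; hypothetical helper for test data."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

X, E = crandn(m, n), crandn(m, n)  # base point and direction
u, v = crandn(m), crandn(n)

def g(X):
    return u - X @ v

def F(X):
    return np.eye(m) + X @ X.conj().T

def Q(X):
    gx = g(X)
    return gx.conj() @ F(X) @ gx  # g^dagger F g, a scalar

def dQ(X, E):
    """Directional derivative of Q at X along E, term by term (Leibniz rule)."""
    gx, Fx = g(X), F(X)
    dg = -E @ v                           # derivative of g along E
    dF = E @ X.conj().T + X @ E.conj().T  # derivative of F along E
    return (dg.conj() @ Fx @ gx       # (dg)^dagger F g
            + gx.conj() @ dF @ gx     # g^dagger (dF) g
            + gx.conj() @ Fx @ dg)    # g^dagger F (dg)

# Central finite difference with a real step t
t = 1e-6
numeric = (Q(X + t * E) - Q(X - t * E)) / (2 * t)
analytic = dQ(X, E)
print("finite difference:", numeric)
print("Leibniz formula:  ", analytic)
```

The two printed values should agree up to finite-difference error. Taking $E$ to be each basis element $e_i$ in turn gives the individual partial derivatives, which together describe the full differential.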