[Math] Derivative of a Matrix w.r.t. a Matrix

calculus, linear algebra, matrices

I have a matrix product $\mathbf{F(X)} = \mathbf{XAA}^T$ with $\mathbf{X} \in \mathbb{R}^{m\times n}$, where $\mathbf{A}$ is a constant matrix w.r.t. $\mathbf{X}$. According to Wikipedia, I can write the differential as follows.
$$
d\mathbf{F(X)} = (d\mathbf{X})\mathbf{AA}^T + \mathbf{X}d(\mathbf{AA}^T) = (d\mathbf{X})\mathbf{AA}^T
$$

From here, can I write the following?
$$
\frac{d\mathbf{F(X)}}{d\mathbf{X}} = \mathbf{I}_{m\times n}\mathbf{AA}^T = \mathbf{AA}^T
$$

Note that I have used the fact that the derivative of an ${m\times n}$ matrix $\mathbf{A}$ with respect to itself is $\mathbf{I}_{m\times n}$, as found on page 4 of the Notes on Matrix Calculus by Paul L. Fackler. I'm not sure what exactly $\mathbf{I}_{m\times n}$ is, but I'm taking it to be some sort of generalized identity matrix and assuming that premultiplying $\mathbf{AA}^T$ by $\mathbf{I}_{m\times n}$ simply gives $\mathbf{AA}^T$.

So, in short, my question is: can I write $\frac{d\mathbf{F(X)}}{d\mathbf{X}}$ as $\mathbf{AA}^T$ in this case?

Best Answer

The differential is correct:
$$\eqalign{
dF &= dX\,AA^T \cr
&= I\,dX\,AA^T \cr
}$$
where $I$ is the $m\times m$ identity matrix.

What I normally do at this point is to follow the Magnus-Neudecker convention and apply vec() to both sides, using the identity ${\rm vec}(PXQ) = (Q^T\otimes P)\,{\rm vec}(X)$ together with the symmetry of $AA^T$:
$$\eqalign{
{\rm vec}(dF) &= (AA^T\otimes I)\,{\rm vec}(dX) \cr
d{\rm vec}(F) &= (AA^T\otimes I)\,\,d{\rm vec}(X) \cr\cr
\frac {\partial\,{\rm vec}(F)} {\partial\,{\rm vec}(X)^T} &= AA^T\otimes I \cr
}$$

If you don't use vectorization, then you have to deal with $\frac{\partial F}{\partial X}$ as a full-blown fourth-order tensor, in which case index notation is the best way to proceed.
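As a sketch of what the index-notation route looks like here (the shorthand $S = AA^T$ and the index names $i,j,k,l,q$ are introduced only for this illustration, with summation over the repeated index $q$):
$$\eqalign{
F_{ij} &= X_{iq}\,S_{qj} \cr
\frac{\partial F_{ij}}{\partial X_{kl}} &= \delta_{ik}\,\delta_{ql}\,S_{qj} = \delta_{ik}\,S_{lj} \cr
}$$
This object carries four indices, $(i,j,k,l)$. Flattening the pairs $(i,j)$ and $(k,l)$ in column-major order should reproduce the $mn\times mn$ matrix $S\otimes I$, since $S$ is symmetric.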

In any case, the derivative is definitely not $AA^T$, which is just a matrix, i.e. a second-order tensor.
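A quick numerical sanity check can make the size mismatch concrete. The following is a minimal sketch, assuming NumPy and small arbitrary sizes. The helper names `vec` and `numerical_jacobian` are hypothetical and chosen only for this illustration. It compares a finite-difference Jacobian of ${\rm vec}(F)$ against $AA^T\otimes I$.

```python
import numpy as np

def vec(M):
    # Column-stacking vectorization (the Magnus-Neudecker convention).
    return M.reshape(-1, order="F")

def numerical_jacobian(f, X, eps=1e-6):
    # Central-difference estimate of d vec(f(X)) / d vec(X)^T.
    m, n = X.shape
    J = np.zeros((m * n, m * n))
    for k in range(m * n):
        e = np.zeros(m * n)
        e[k] = eps
        dX = e.reshape((m, n), order="F")  # perturb one entry of X
        J[:, k] = (vec(f(X + dX)) - vec(f(X - dX))) / (2 * eps)
    return J

# Arbitrary illustrative sizes: X is m x n, A is n x p.
rng = np.random.default_rng(0)
m, n, p = 3, 4, 2
X = rng.standard_normal((m, n))
A = rng.standard_normal((n, p))
S = A @ A.T  # the constant n x n matrix A A^T

J_num = numerical_jacobian(lambda Z: Z @ S, X)
J_kron = np.kron(S, np.eye(m))  # A A^T (x) I_m

print("Jacobian shape:", J_num.shape)
print("A A^T shape:", S.shape)
print("matches kron(AA^T, I):", np.allclose(J_num, J_kron, atol=1e-6))
```

With these sizes the Jacobian is $mn\times mn$ while $AA^T$ is only $n\times n$, so the two cannot be equal. The `order="F"` argument matters: with row-major flattening the matching Kronecker product would be $I\otimes AA^T$ instead.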