[Math] Derivative of matrix-valued function with respect to matrix input

derivatives, matrices, matrix-calculus

I have the expression

$$\bf \phi = \bf X W$$

where $\bf X$ is a $20 \times 10$ matrix and $\bf W$ is a $10 \times 5$ matrix.

How can I calculate $\frac{d\phi}{d\bf W}$? What is the dimension of the result?

Best Answer

Let the function $\mathrm F : \mathbb R^{n \times p} \to \mathbb R^{m \times p}$ be defined as follows

$$\rm F (X) := A X$$

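where $\mathrm A \in \mathbb R^{m \times n}$ is given. (In the question's notation, $\mathrm A$ plays the role of $\bf X$ and $\mathrm X$ plays the role of $\bf W$, so $m = 20$, $n = 10$ and $p = 5$.) Let $\mathrm e_i \in \mathbb R^m$ and $\mathrm e_j \in \mathbb R^p$ denote standard basis vectors. The $(i,j)$-th entry of the output is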

$$f_{ij} (\mathrm X) = \mathrm e_i^\top \mathrm A \, \mathrm X \, \mathrm e_j = \mbox{tr} \left( \mathrm e_j \mathrm e_i^\top \mathrm A \, \mathrm X \right) = \langle \mathrm A^\top \mathrm e_i \mathrm e_j^\top, \mathrm X \rangle$$

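Since $f_{ij}$ is linear in $\mathrm X$, its gradient is the matrix paired with $\mathrm X$ in the inner product. Hence,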

$$\partial_{\mathrm X} \, f_{ij} (\mathrm X) = \color{blue}{\mathrm A^\top \mathrm e_i \mathrm e_j^\top}$$
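
As a sanity check, here is a minimal NumPy sketch that compares the closed-form gradient with central finite differences, using the question's shapes ($\mathrm A$ is $20 \times 10$, $\mathrm X$ is $10 \times 5$). The helper names `grad_entry` and `numeric_grad_entry` are made up for illustration, and stacking all the gradients into a 4th-order array is just one possible layout convention, not the only one.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 20, 10, 5                 # shapes taken from the question
A = rng.standard_normal((m, n))     # plays the role of X in the question
X = rng.standard_normal((n, p))     # plays the role of W in the question


def grad_entry(A, i, j, p):
    """Closed-form gradient of f_ij(X) = (A X)[i, j], i.e. A^T e_i e_j^T."""
    e_i = np.zeros(A.shape[0])
    e_i[i] = 1.0
    e_j = np.zeros(p)
    e_j[j] = 1.0
    return A.T @ np.outer(e_i, e_j)


def numeric_grad_entry(A, X, i, j, h=1e-6):
    """Central finite differences of f_ij with respect to each entry of X."""
    G = np.zeros_like(X)
    for k in range(X.shape[0]):
        for l in range(X.shape[1]):
            E = np.zeros_like(X)
            E[k, l] = h
            G[k, l] = ((A @ (X + E))[i, j] - (A @ (X - E))[i, j]) / (2 * h)
    return G


i, j = 3, 2                         # arbitrary output entry to inspect
G_closed = grad_entry(A, i, j, p)
G_numeric = numeric_grad_entry(A, X, i, j)
print(G_closed.shape, np.allclose(G_closed, G_numeric, atol=1e-6))

# Assumed layout: D[i, j, k, l] = d f_ij / d X_kl = A[i, k] * delta_{jl}
D = np.einsum("ik,jl->ijkl", A, np.eye(p))
print(D.shape, np.allclose(D[i, j], G_closed))
```

Under that layout, each of the $20 \cdot 5$ output entries contributes one $10 \times 5$ gradient, so the full derivative is a $20 \times 5 \times 10 \times 5$ array. If one instead vectorizes both sides (column-major), the same information is the $100 \times 50$ Jacobian $\mathrm I_5 \otimes \mathrm A$.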