[Math] Derivative of AXB with respect to X

derivatives, linear algebra, matrices

Assume that $A\in\mathbb{R}^{m\times m}$ and $B\in\mathbb{R}^{n\times n}$ are two constant matrices and that $X\in\mathbb{R}^{m\times n}$. How can I find the derivative of $AXB$ with respect to $X$? That is, how do I compute $$\frac{\partial(AXB)}{\partial X}?$$

I think that it is easy to see that
$$\mathrm{vec}(AXB)=(B^T\otimes A)\mathrm{vec}(X),$$
where the notation $\otimes$ denotes the Kronecker product, and $\mathrm{vec}(X)$ is the vectorization of the matrix $X$.
Therefore, we have
$$\frac{\partial ((B^T\otimes A)\mathrm{vec}(X))}{\partial \mathrm{vec}(X)}=B^T\otimes A.$$
What I cannot understand is the relationship between $\frac{\partial(AXB)}{\partial X}$ and $\frac{\partial ((B^T\otimes A)\mathrm{vec}(X))}{\partial \mathrm{vec}(X)}$. How are the two related?

Thank you very much for the help.

Best Answer

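When you vectorize the $m\times n$ matrix $X$ by stacking its columns, you obtain a vector $v =\mathrm{vec}(X)$ of length $mn$ whose $i$th element (counting from 1) is $v_i = X_{(i-1)\%m+1,\ (i-1)//m+1},$ where $//$ denotes integer division and $\%$ the remainder of that division. Equivalently, the entry $X_{k,l}$ sits at position $m(l-1)+k$ of $v$.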
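As a quick sanity check of the column-stacking convention, here is a minimal NumPy sketch. The sizes `m = 3`, `n = 4` and the helper name `vec_position` are illustrative assumptions, not part of the question. It checks that $X_{k,l}$ lands at position $m(l-1)+k$ of $\mathrm{vec}(X)$ (1-based), and that in 0-based indexing position $p$ comes from row $p\%m$ and column $p//m$.

```python
import numpy as np

m, n = 3, 4  # illustrative sizes (assumption)
X = np.arange(1, m * n + 1, dtype=float).reshape(m, n)

# Column-stacking vec: order="F" walks down each column in turn.
v = X.reshape(-1, order="F")

def vec_position(k, l, m):
    """1-based position of X[k, l] (1-based k, l) inside vec(X)."""
    return m * (l - 1) + k

# Forward map: every entry X[k, l] lands at position m*(l-1)+k.
forward_ok = all(
    v[vec_position(k, l, m) - 1] == X[k - 1, l - 1]
    for k in range(1, m + 1)
    for l in range(1, n + 1)
)

# Inverse map in 0-based indexing: position p comes from row p % m, column p // m.
inverse_ok = all(v[p] == X[p % m, p // m] for p in range(m * n))

print(forward_ok, inverse_ok)
```

Both flags should come out true. Note that NumPy's default `reshape(-1)` is row-major, so `order="F"` is what makes it match the usual definition of $\mathrm{vec}$.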

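Now consider what you mean by $$\frac{\partial AXB}{\partial X}.$$ You are taking the derivative of an object with two indices with respect to an object with two indices, so you are looking at all terms of the form $$\frac{\partial [AXB]_{i,j}} {\partial X_{k,l}}.$$ Since $[AXB]_{i,j}=\sum_{k,l}A_{i,k}X_{k,l}B_{l,j}$, each of these terms equals $A_{i,k}B_{l,j}$. The matrix derivative flattens this four-index object into a matrix by vectorizing each pair of indices: the row is the position of $(i,j)$ in $\mathrm{vec}(AXB)$ and the column is the position of $(k,l)$ in $\mathrm{vec}(X)$, so $$ \frac{\partial [AXB]_{i,j}} {\partial X_{k,l}}= A_{i,k}B_{l,j} = (B^T\otimes A)_{m(j-1)+i,\ m(l-1)+k},$$ where the indexing reverses the vectorization operation. In other words, $\frac{\partial(AXB)}{\partial X}$ and $\frac{\partial\,\mathrm{vec}(AXB)}{\partial\,\mathrm{vec}(X)}=B^T\otimes A$ contain exactly the same entries; the second is just the first with its index pairs flattened.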
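The index bookkeeping above can be checked numerically. The following is a rough sketch, assuming random test matrices with the illustrative sizes `m = 3`, `n = 4`; the names `vec`, `K`, and `J` are ad hoc. Because $X\mapsto AXB$ is linear, feeding in the unit matrix $E_{k,l}$ gives the derivative with respect to $X_{k,l}$ exactly, so no finite-difference step is needed.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4  # illustrative sizes (assumption)
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
X = rng.standard_normal((m, n))

def vec(M):
    """Column-stacking vectorization."""
    return M.reshape(-1, order="F")

K = np.kron(B.T, A)  # candidate Jacobian, shape (m*n, m*n)

# The identity from the question: vec(AXB) = (B^T kron A) vec(X).
identity_holds = np.allclose(vec(A @ X @ B), K @ vec(X))

# Build the Jacobian one column at a time. The column for X[k, l]
# (0-based position m*l + k in vec(X)) is vec(A E_kl B).
J = np.zeros((m * n, m * n))
for l in range(n):
    for k in range(m):
        E = np.zeros((m, n))
        E[k, l] = 1.0
        J[:, m * l + k] = vec(A @ E @ B)

jacobian_matches = np.allclose(J, K)

# Entry-by-entry: d[AXB]_{ij} / dX_{kl} = A_{ik} B_{lj} (0-based indices here).
entries_match = all(
    np.isclose(J[m * j + i, m * l + k], A[i, k] * B[l, j])
    for i in range(m) for j in range(n)
    for k in range(m) for l in range(n)
)

print(identity_holds, jacobian_matches, entries_match)
```

The 0-based positions `m * j + i` and `m * l + k` in the code correspond to the 1-based positions $m(j-1)+i$ and $m(l-1)+k$ in the formula.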
