Derivative of a Matrix with respect to a Matrix

linear-algebra, matrix-calculus, multivariable-calculus

I want to calculate the derivative of the product of two matrices that do not have the same dimensions.

$X = \begin{bmatrix}x_{11} & x_{12} & x_{13}\\x_{21} & x_{22} & x_{23}\\x_{31} & x_{32} & x_{33}\end{bmatrix}$

$y= \begin{bmatrix}y_{11} & y_{12}\\y_{21} & y_{22}\\y_{31} & y_{32}\end{bmatrix}$

The problem is that I can't figure out what it means to take the derivative of a matrix with respect to the individual elements of another matrix.
I tried using summation notation to calculate the derivative of a single element of the resulting matrix.

$c_{i,j} = \sum_{k=1}^na_{i,k}\cdot b_{k,j}$

$\frac{\partial (X y)_{11}}{\partial X} =
\begin{bmatrix}y_{11} & y_{21} & y_{31}\\ 0 & 0 & 0 \\ 0 & 0 & 0\end{bmatrix}$
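
For reference, expanding the sum for this single entry shows where the nonzero row comes from, and the same computation gives the general entrywise pattern. The Kronecker delta $\delta_{ik}$ is notation introduced here for illustration and is not part of the original question.

```latex
(Xy)_{11} = x_{11}y_{11} + x_{12}y_{21} + x_{13}y_{31},
\qquad
\frac{\partial (Xy)_{ij}}{\partial x_{kl}} = \delta_{ik}\, y_{lj},
\qquad
\delta_{ik} =
\begin{cases}
1, & i = k,\\
0, & i \neq k.
\end{cases}
```

So the derivative of $(Xy)_{ij}$ with respect to $X$ is a $3 \times 3$ matrix whose $i$-th row is the $j$-th column of $y$ and whose other rows are zero.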

The other partial derivatives are similar.
What I want to know is:

$\frac{\partial Xy}{\partial X} = ?$

I can't figure out how to assemble this when the derivative of each element is itself a matrix,
and the two input matrices do not even have the same dimensions.

Best Answer

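Identify the $3\times 2$ matrix $Xy$ with its vectorization (columns stacked into a single vector). Using the identity $\mathrm{vec}(AXB) = (B'\otimes A)\,\mathrm{vec}(X)$ with $A = I$ (the $3\times 3$ identity) and $B = y$, $$\mathrm{vec}(Xy) = \mathrm{vec}(IXy) = (y'\otimes I)\,\mathrm{vec}(X) = (y\otimes I)'\,\mathrm{vec}(X).$$ Taking the derivative with respect to $\mathrm{vec}(X)$ gives $y\otimes I$, a $9\times 6$ matrix; its transpose $(y\otimes I)'$ is the $6\times 9$ Jacobian. This is consistent with the comment of Ben Grossmann, as it is the "vectorization" of said fourth-order tensor.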
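
As a quick sanity check, the identity can be verified numerically. The following is a minimal sketch using NumPy, assuming column-stacking vectorization; the helper `vec`, the random test matrices, and the finite-difference step `eps` are illustrative choices, not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))   # arbitrary 3x3 test matrix
y = rng.standard_normal((3, 2))   # arbitrary 3x2 test matrix


def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")


I = np.eye(3)
J = np.kron(y.T, I)               # (y ⊗ I)' = y' ⊗ I, shape 6x9

# The identity vec(Xy) = (y ⊗ I)' vec(X)
assert np.allclose(vec(X @ y), J @ vec(X))

# Finite-difference Jacobian of vec(X) -> vec(Xy), one column per entry of vec(X)
eps = 1e-6
J_fd = np.zeros((6, 9))
for k in range(9):
    step = np.zeros(9)
    step[k] = eps
    X_step = (vec(X) + step).reshape(3, 3, order="F")
    J_fd[:, k] = (vec(X_step @ y) - vec(X @ y)) / eps
assert np.allclose(J, J_fd, atol=1e-6)

# Row 0 of J, reshaped to 3x3, is the derivative of (Xy)_{11} with respect to X
G = J[0].reshape(3, 3, order="F")
print(G)
```

Each row of `J` is the derivative of one entry of $Xy$, flattened; reshaping row 0 back to $3\times 3$ gives a matrix whose first row is the first column of $y$ and whose remaining rows are zero, which matches the entrywise computation in the question.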
