Derivative of a trace with second order Kronecker product

derivatives, kronecker-product, matrix-calculus, trace

I am trying to compute the derivative of $J$ with respect to $F$, where
$$
J = \mathrm{Tr}\lbrack(I_{N} \otimes F)^{T}A(I_{N} \otimes F)B\rbrack
$$

$$
F \in \mathbb{R}^{N \times Nn},\ \ A \in \mathbb{R}^{NN \times NN}, \ \ B \in \mathbb{R}^{NNn \times NNn}
$$

Here $B$ is a symmetric matrix.

I have seen similar posts on the derivative of the trace of a Kronecker product, but I am not sure how to handle the case where the Kronecker product appears twice, so that the expression is second order in $F$.
Thank you a lot in advance!

Best Answer

Define the matrices
$$\eqalign{ X &= I\otimes F \\ G &= (A+A^T)XB \\ }$$

Then the cost function can be written as
$$\eqalign{ {\cal J} &= A^TX:XB \\ }$$
where a colon denotes the trace/Frobenius product, i.e.
$$M:N = {\rm Tr}(M^TN)$$

Next, calculate the differential of the cost function. The second line uses the symmetry of $B$.
$$\eqalign{ d{\cal J} &= A^TdX:XB + A^TX:dX\,B \\ &= dX:AXB + A^TXB:dX \\ &= (A+A^T)XB:dX \\ &= G:dX \\ &= G:(I\otimes dF) \\ }$$

At this point, calculate the SVD of $G$
$$\eqalign{ &G = \sum_{k=1}^r \sigma_ku_kv_k^T \\ &u_k \in {\mathbb R}^{NN\times 1},\quad &r,\sigma_k \in {\mathbb R} \\ &v_k \in {\mathbb R}^{NNn\times 1},\quad &r = {\rm rank}(G) \\ }$$

Reshape the singular vectors into matrices (unstack one column into $N$ columns)
$$\eqalign{ U_k &= {\rm Reshape}(u_k,\,\,N\times N)\;&\iff\; u_k&= {\rm vec}(U_k) \\ V_k &= {\rm Reshape}(v_k,\,Nn\times N) \;&\iff\;\;v_k&= {\rm vec}(V_k) \\ }$$
and use them to finish the calculation of the gradient.
$$\eqalign{ d{\cal J} &= \sum_{k=1}^r \sigma_ku_kv_k^T:(I\otimes dF) \\ &= \sum_{k=1}^r \sigma_ku_k^T(I\otimes dF)v_k \\ &= \sum_{k=1}^r \sigma_k{\rm vec}(U_k)^T{\rm vec}(dF\,V_k) \\ &= \sum_{k=1}^r \sigma_kU_k:(dF\,V_k) \\ &= \sum_{k=1}^r \sigma_kU_kV_k^T:dF \\ \frac{\partial{\cal J}}{\partial F} &= \sum_{k=1}^r \sigma_kU_kV_k^T \\ }$$
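The SVD-based formula can be checked numerically. The following is a minimal NumPy sketch, not part of the original derivation: the function names `cost`, `grad_svd` and `grad_fd` and the sizes $N=2$, $n=3$ are hypothetical, and the finite-difference routine serves only as an independent comparison. It assumes ${\rm vec}$ stacks columns, which corresponds to `order="F"` in `reshape`.

```python
import numpy as np

def cost(F, A, B, N):
    """J = Tr[(I_N kron F)^T A (I_N kron F) B]."""
    X = np.kron(np.eye(N), F)
    return np.trace(X.T @ A @ X @ B)

def grad_svd(F, A, B, N):
    """Gradient dJ/dF assembled from the SVD of G = (A + A^T) X B."""
    n = F.shape[1] // N
    X = np.kron(np.eye(N), F)                 # shape (N*N, N*N*n)
    G = (A + A.T) @ X @ B                     # shape (N*N, N*N*n)
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    grad = np.zeros_like(F)
    for k in range(len(s)):
        # vec() stacks columns, so undo it with column-major (Fortran) order.
        U_k = U[:, k].reshape((N, N), order="F")
        V_k = Vt[k, :].reshape((N * n, N), order="F")
        grad += s[k] * U_k @ V_k.T
    return grad

def grad_fd(F, A, B, N, eps=1e-5):
    """Central finite differences, used only as an independent check."""
    g = np.zeros_like(F)
    for idx in np.ndindex(*F.shape):
        dF = np.zeros_like(F)
        dF[idx] = eps
        g[idx] = (cost(F + dF, A, B, N) - cost(F - dF, A, B, N)) / (2 * eps)
    return g

# Small random instance (hypothetical sizes N = 2, n = 3).
rng = np.random.default_rng(0)
N, n = 2, 3
F = rng.standard_normal((N, N * n))
A = rng.standard_normal((N * N, N * N))
M = rng.standard_normal((N * N * n, N * N * n))
B = M + M.T                                   # B must be symmetric

# Largest entrywise gap between the SVD formula and finite differences.
max_err = np.max(np.abs(grad_svd(F, A, B, N) - grad_fd(F, A, B, N)))
print(max_err)
```

Because ${\cal J}$ is quadratic in $F$, central differences carry no truncation error, so any remaining discrepancy should be at the level of floating-point rounding.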

Update

Based on the results of this post, we can calculate the solution without resorting to the SVD of $G$. Instead we'll use a decomposition involving the standard basis $E$-matrices
$$\eqalign{ G &\in {\mathbb R}^{JK\times PQ},\qquad E_{kq} \in {\mathbb R}^{K\times Q},\quad C_{kq} \in {\mathbb R}^{J\times P} \\ G &= \sum_{k=1}^{K}\sum_{q=1}^{Q} C_{kq}\otimes E_{kq} \\ C_{kq} &= \sum_{j=1}^{J}\sum_{p=1}^{P} G_{(jK-K+k)(pQ-Q+q)}\;E_{jp} \\ }$$

Note that the trace of each $C_{kq}$ coefficient is a sum over a few elements of $G$
$$\eqalign{ {\rm Tr}(C_{kq}) &= \sum_{j=1}^{J} G_{(jK-K+k)(jQ-Q+q)} \\ }$$

Set $\,(J,K,P,Q)\to(N,N,N,Nn)\,$ so that the matrices $\,(C_{kq},I)\,$ will have the same dimensions, as will $\,(E_{kq},F).\,$ Then recalculate the gradient
$$\eqalign{ d{\cal J} &= G:(I\otimes dF) \\ &= \sum_{k=1}^{N}\sum_{q=1}^{Nn}\;(C_{kq}\otimes E_{kq}):(I\otimes dF) \\ &= \sum_{k=1}^{N}\sum_{q=1}^{Nn}\;(C_{kq}:I)\,(E_{kq}:dF) \\ &=\left(\sum_{k=1}^{N}\sum_{q=1}^{Nn}\; E_{kq}\;{\rm Tr}(C_{kq})\right):dF\\ \frac{\partial{\cal J}}{\partial F} &= \sum_{k=1}^{N}\sum_{q=1}^{Nn}\;E_{kq}\,{\rm Tr}(C_{kq}) \\ }$$

This expression appears more complicated than the previous one; however, it can be evaluated using nothing more than the (shuffled and summed) elements of $G$.

The formula for the components of the gradient shows this quite clearly
$$\eqalign{ \frac{\partial{\cal J}}{\partial F_{kq}} \;=\; {\rm Tr}(C_{kq}) \;=\; \sum_{j=1}^{N} G_{(jN-N+k)(jnN-nN+q)} \\ }$$
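The component formula translates almost directly into index arithmetic. Below is a rough NumPy sketch under a few assumptions: indices are shifted to 0-based, the function names `grad_from_G_loops` and `grad_from_G_einsum` are hypothetical, and $G$ is a random matrix of the right shape rather than one built from $A$, $F$ and $B$. The identity $G:(I\otimes dF) = \frac{\partial{\cal J}}{\partial F}:dF$ holds for any $G$ of that shape, which is what the last lines compare.

```python
import numpy as np

def grad_from_G_loops(G, N, n):
    """Entry (k, q) is sum_j G[(j-1)N + k, (j-1)Nn + q], written 0-based."""
    grad = np.zeros((N, N * n))
    for k in range(N):
        for q in range(N * n):
            grad[k, q] = sum(G[j * N + k, j * N * n + q] for j in range(N))
    return grad

def grad_from_G_einsum(G, N, n):
    """Same sum, vectorized: G4[j, k, p, q] = G[j*N + k, p*N*n + q]."""
    G4 = G.reshape(N, N, N, N * n)
    # Repeating the index j takes the trace over the block indices (j = p).
    return np.einsum("jkjq->kq", G4)

# Random test data (hypothetical sizes N = 3, n = 2).
rng = np.random.default_rng(1)
N, n = 3, 2
G = rng.standard_normal((N * N, N * N * n))
dF = rng.standard_normal((N, N * n))

g_loops = grad_from_G_loops(G, N, n)
g_einsum = grad_from_G_einsum(G, N, n)

lhs = np.sum(G * np.kron(np.eye(N), dF))      # Frobenius product G : (I kron dF)
rhs = np.sum(g_loops * dF)                    # Frobenius product grad : dF
print(lhs, rhs)
```

The `einsum` variant avoids the Python loops and may be preferable for larger $N$; both versions read only $N$ entries of $G$ per gradient component.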

Here's a way to express the gradient without requiring any factorizations.
$$\eqalign{ \frac{\partial{\cal J}}{\partial F} \;=\; \sum_{k=1}^N\; \big(e_k^T\otimes I_N\big)\, \big(A+A^T\big)\,\big(I_N\otimes F\big)\,B\, \big(e_k\otimes I_{Nn}\big) \\ }$$
where $e_k$ is the $k^{th}$ column of $I_N$. Each term in the sum extracts the $k^{th}$ diagonal $N\times Nn$ block of $G=(A+A^T)(I_N\otimes F)B$.
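The factorization-free formula is short enough to code directly. This is a sketch only, assuming NumPy; the function names `cost`, `grad_selector` and `grad_blocks` and the sizes $N=3$, $n=2$ are hypothetical. The second function replaces the selector products by plain slicing of the diagonal blocks, which should give the same result without forming the Kronecker selectors.

```python
import numpy as np

def cost(F, A, B, N):
    """J = Tr[(I_N kron F)^T A (I_N kron F) B]."""
    X = np.kron(np.eye(N), F)
    return np.trace(X.T @ A @ X @ B)

def grad_selector(F, A, B, N):
    """Sum over k of (e_k^T kron I_N) G (e_k kron I_Nn)."""
    n = F.shape[1] // N
    I_N, I_Nn = np.eye(N), np.eye(N * n)
    G = (A + A.T) @ np.kron(I_N, F) @ B
    grad = np.zeros_like(F)
    for k in range(N):
        e_k = I_N[:, [k]]                     # k-th column of I_N, shape (N, 1)
        grad += np.kron(e_k.T, I_N) @ G @ np.kron(e_k, I_Nn)
    return grad

def grad_blocks(F, A, B, N):
    """Same result by slicing out and summing the diagonal blocks of G."""
    n = F.shape[1] // N
    G = (A + A.T) @ np.kron(np.eye(N), F) @ B
    return sum(G[k * N:(k + 1) * N, k * N * n:(k + 1) * N * n] for k in range(N))

# Random instance (hypothetical sizes N = 3, n = 2).
rng = np.random.default_rng(2)
N, n = 3, 2
F = rng.standard_normal((N, N * n))
A = rng.standard_normal((N * N, N * N))
M = rng.standard_normal((N * N * n, N * N * n))
B = M + M.T                                   # B must be symmetric
dF = rng.standard_normal((N, N * n))

g_sel = grad_selector(F, A, B, N)
g_blk = grad_blocks(F, A, B, N)

# J is quadratic in F, so this symmetric difference equals grad : dF
# up to floating-point rounding, for any dF.
directional = 0.5 * (cost(F + dF, A, B, N) - cost(F - dF, A, B, N))
print(directional, np.sum(g_sel * dF))
```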
