Derivative of function with the Kronecker product of a Matrix with respect to vech

derivatives, kronecker-product, matrices, matrix-calculus, vectorization

I have $\Sigma$ a symmetric $2 \times 2$ matrix, and $\Sigma^{-1}$ is its inverse.

Now, $\tilde{\Sigma}^{-1}=\Sigma^{-1} \otimes I_{n \times n}$ (Kronecker product).

I have a function $Y=f(\tilde{\Sigma}^{-1})$ that gives a value in $\mathbb R$.

Let's define $\Phi_{\Sigma}=vech(\Sigma)$

Now, I am trying to get $\frac{\partial Y}{\partial \Phi_{\Sigma}}$.

So far I have

$\frac{\partial Y}{\partial \Phi_{\Sigma}^T} = \Bigg( \frac{\partial vec(\tilde{\Sigma}^{-1})}{\partial \Phi_{\Sigma}^T} \Bigg)^T \Bigg( \frac{\partial vec(Y)}{\partial vec(\tilde{\Sigma}^{-1})^T} \Bigg)$

I've been working on this and found that $\Bigg( \frac{\partial vec(Y)}{\partial vec(\tilde{\Sigma}^{-1})^T} \Bigg)$ is a vector with $n \times n$ elements.
Now, working with the first part of the derivative:

$\Bigg( \frac{\partial vec(\tilde{\Sigma}^{-1})}{\partial \Phi_{\Sigma}^T} \Bigg)^T = \Bigg( \frac{\partial vec(\tilde{\Sigma}^{-1})}{\partial vec(\Sigma)^T} D_2 \Bigg)^T = \Bigg( vec \Big( \frac{\partial \tilde{\Sigma}^{-1}}{\partial \Sigma}\Big) D_2 \Bigg)^T = \Bigg( vec \Big( \frac{\partial \tilde{\Sigma}^{-1}}{\partial \Sigma^{-1}} \frac{\partial \Sigma^{-1}}{\partial \Sigma} \Big) D_2 \Bigg)^T$
$= \Bigg( vec \Big( (I_2 \otimes I_n) (-\Sigma^{-1} \Sigma^{-1}) \Big) D_2 \Bigg)^T$

where $D_2$ is the duplication matrix.

However, the matrices $(I_2 \otimes I_n)$ and $-\Sigma^{-1} \Sigma^{-1}$ are not conformable, so this must be wrong. Also, since $\Bigg( \frac{\partial vec(Y)}{\partial vec(\tilde{\Sigma}^{-1})^T} \Bigg)$ is a vector with $n \times n$ elements and $\frac{\partial Y}{\partial \Phi_{\Sigma}}$ is $3 \times 1$, the factor $\Bigg( \frac{\partial vec(\tilde{\Sigma}^{-1})}{\partial \Phi_{\Sigma}^T} \Bigg)^T$ should be $3 \times (n \times n)$.
May I ask for advice on solving this task?

Best Answer

For ease of typing, define
$$\eqalign{ &M = \Sigma,\quad &N = \Sigma^{-1} \\ &R = M\otimes I,\quad &S = N\otimes I = R^{-1},\quad &f = f(S) \\ &h = {\rm vech}(M),\quad &v = {\rm vec}(M) \\ &D = D_2,\quad &v = Dh \\ }$$
You don't tell us anything about the function $f(S)$, so I'll assume you don't need help calculating its gradient $G=\left(\frac{\partial f}{\partial S}\right)$.
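As a purely illustrative case (the question does not specify $f$, so the vector $y\in{\mathbb R}^{2n}$ and this particular quadratic form are assumptions made here), the gradient can be read off directly from the differential:
$$\eqalign{ f &= y^TSy = {\rm Tr}\big(yy^TS\big) \\ df &= {\rm Tr}\big(yy^T\,dS\big) \\ G &= yy^T \\ }$$
Any other differentiable $f$ is handled the same way once its $G$ is known.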

Before we begin, we need a few results from Wikipedia and this post which can be summarized
$$\eqalign{ &A\in{\mathbb R}^{m\times n},\quad B\in{\mathbb R}^{p\times q} \\ &I_k\in{\mathbb R}^{k\times k}\qquad \big({\rm Identity\,Matrix}\big) \\ &a = {\rm vec}(A),\quad b={\rm vec}(B)\\ &x={\rm vec}(A^T) = K_{m,n}\,a\quad \big({\rm Commutation\,Matrix}\big) \\ &{\rm vec}(A\otimes B) = \left(I_n\otimes K_{q,m}\otimes I_p\right)(I_m\otimes I_n\otimes b)\,a \\ }$$

Applying this with $A=M$ (a $2\times 2$ matrix) and $B=I_n$ (an $n\times n$ matrix, where $n$ is now the dimension from the question rather than the column count of $A$), we can write
$$\eqalign{ {\rm vec}(R) &= {\rm vec}\big(M\otimes I_n\big) \\ &= \Big(I_2\otimes K_{n,2}\otimes I_n\Big) \Big(I_2\otimes I_2\otimes{\rm vec}(I_n)\Big)\,v \\ &= Qv \\ }$$
where $Q$ is a constant $4n^2\times 4$ matrix.

Start by writing the differential of the function, then perform a sequence of changes of variables from $S\to R\to v\to h$.
$$\eqalign{ df &= G:dS \\&= G:(-S\,dR\,S) \\&= -SGS:dR \\ &= -\operatorname{vec}\left(SGS\right):Q\,dv \\ &= -Q^T\operatorname{vec}\left(SGS\right):dv \\ &= -Q^T\operatorname{vec}\left(SGS\right):D\,dh \\ &= -D^TQ^T\operatorname{vec}\left(SGS\right):dh \\ \frac{\partial f}{\partial h} &= -D^TQ^T\operatorname{vec}\left(SGS\right) \\ }$$
Since $D^T$ is $3\times 4$ and $Q^T$ is $4\times 4n^2$, the result is the required $3\times 1$ vector.

The trace/Frobenius product $\;A:B = {\rm Tr}\big(A^TB\big)\;$ is used in several steps.

The trace's cyclic property allows terms in such products to be rearranged in many ways, e.g. $$\eqalign{ A:B &= A^T:B^T &= B:A \\ A:BC &= B^TA:C &= AC^T:B \\ }$$ Several steps also made use of the fact that $(M,N)$ and therefore $(R,S)$ are symmetric matrices.
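A quick numerical sanity check of the final formula can be sketched in NumPy. Everything specific below is an assumption chosen only for testing: the block size `n = 3`, the test function $f(S) = C:S + \tfrac12\,S:S$ (so that $G = C + S$), the random matrix `C`, and the helper names `vec`, `commutation`, `duplication` and `S_of_h`. The sketch builds $K$, $D$ and $Q$ explicitly, confirms ${\rm vec}(M\otimes I_n)=Qv$, and compares $-D^TQ^T{\rm vec}(SGS)$ with central finite differences in $h$.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into one vector (column-major order)."""
    return A.reshape(-1, order="F")

def commutation(m, n):
    """K_{m,n}: K @ vec(A) == vec(A.T) for any A of shape (m, n)."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # A[i, j] sits at i + j*m in vec(A) and at j + i*n in vec(A.T)
            K[j + i * n, i + j * m] = 1.0
    return K

def duplication(k):
    """D_k: D @ vech(M) == vec(M) for any symmetric k x k matrix M."""
    D = np.zeros((k * k, k * (k + 1) // 2))
    col = 0
    for j in range(k):
        for i in range(j, k):          # lower triangle, column by column
            D[i + j * k, col] = 1.0
            D[j + i * k, col] = 1.0
            col += 1
    return D

n = 3                                    # assumed size of the identity block
rng = np.random.default_rng(0)
C = rng.standard_normal((2 * n, 2 * n))  # arbitrary coefficients (assumption)

def f(S):
    # Hypothetical test function: f(S) = C:S + 0.5 * S:S
    return np.sum(C * S) + 0.5 * np.sum(S * S)

def grad_f(S):
    # G = df/dS for the test function, entries of S treated as independent
    return C + S

D = duplication(2)
Q = (np.kron(np.eye(2), np.kron(commutation(n, 2), np.eye(n)))
     @ np.kron(np.eye(4), vec(np.eye(n)).reshape(-1, 1)))

def S_of_h(h):
    # h = vech(M)  ->  M  ->  S = M^{-1} kron I_n
    M = (D @ h).reshape(2, 2, order="F")
    return np.kron(np.linalg.inv(M), np.eye(n))

h = np.array([2.0, 0.5, 1.0])            # vech of [[2, 0.5], [0.5, 1]]
M = (D @ h).reshape(2, 2, order="F")

# Check the vectorization identity vec(M kron I_n) = Q vec(M)
q_ok = np.allclose(Q @ vec(M), vec(np.kron(M, np.eye(n))))

# Analytic gradient from the derivation
S = S_of_h(h)
analytic = -D.T @ Q.T @ vec(S @ grad_f(S) @ S)

# Central finite differences in each component of h
eps = 1e-6
numeric = np.array([
    (f(S_of_h(h + eps * e)) - f(S_of_h(h - eps * e))) / (2 * eps)
    for e in np.eye(3)
])

print("Q identity holds:", q_ok)
print("max abs difference:", np.max(np.abs(analytic - numeric)))
```

Perturbing $h$ rather than $v$ keeps $M$ symmetric throughout, which is the setting the derivation assumes. The test `C` is deliberately non-symmetric to show that only the symmetry of $S$, not of $G$, is needed.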