Derivative with respect to vectorized inverse Kronecker product


I am trying to derive the gradient of a function I wish to optimize, and wish to obtain the following derivative:
$$
\frac{\partial}{\partial \pmb{x}} \left(\pmb{I} - \pmb{X} \otimes \pmb{X} \right)^{-1} \pmb{y}
$$

with $\pmb{x} = \mathrm{vec}(\pmb{X})$, $\pmb{X}$ a square asymmetric matrix, $\pmb{y}$ a vector that is not a function of $\pmb{x}$, and $\otimes$ the Kronecker product. My thought was to first write:
$$
\left( \pmb{y}^{\top} \otimes \pmb{I} \right) \mathrm{vec}\left( \left(\pmb{I} - \pmb{X} \otimes \pmb{X} \right)^{-1}\right)
$$

next to let $\pmb{f} = \mathrm{vec}\left( \left(\pmb{I} - \pmb{X} \otimes \pmb{X} \right)^{-1}\right)$ and then to express the differential of $\pmb{f}$. I got to:
$$
d\pmb{f} = \left(\left(\pmb{I} - \pmb{X} \otimes \pmb{X} \right)^{-\top} \otimes \left(\pmb{I} - \pmb{X} \otimes \pmb{X} \right)^{-1}\right) \left( \mathrm{vec}\left( (d\pmb{X}) \otimes \pmb{X} \right) + \mathrm{vec}\left( \pmb{X} \otimes (d\pmb{X})\right) \right)
$$

in which $-\top$ is short for the transpose of the inverse. This seems close to the answer, but it is not quite there yet: I am getting lost in trying to express $\mathrm{vec}\left( (d\pmb{X}) \otimes \pmb{X} \right)$ in terms of $d\pmb{x}$.
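The differential above can be sanity-checked numerically before going further. The following is only a minimal NumPy sketch: it assumes a column-major $\mathrm{vec}$, a small random $\pmb{X}$ (scaled so that $\pmb{I} - \pmb{X} \otimes \pmb{X}$ is comfortably invertible), and illustrative helper names (`vec`, `inv_A`) that are not part of the question.

```python
import numpy as np

def vec(M):
    # Column-major vectorization, matching the usual vec() convention.
    return M.reshape(-1, order="F")

rng = np.random.default_rng(0)
n = 3
X = 0.2 * rng.standard_normal((n, n))    # assumption: small entries keep I - kron(X, X) invertible
dX = 1e-6 * rng.standard_normal((n, n))  # small perturbation standing in for the differential
I = np.eye(n * n)

def inv_A(X):
    # (I - X kron X)^{-1}
    return np.linalg.inv(I - np.kron(X, X))

Ainv = inv_A(X)

# Right-hand side of the expression for df.
rhs = np.kron(Ainv.T, Ainv) @ (vec(np.kron(dX, X)) + vec(np.kron(X, dX)))

# Actual change in f = vec((I - X kron X)^{-1}) under the perturbation.
lhs = vec(inv_A(X + dX) - Ainv)

# The two should agree up to terms of second order in dX.
rel_err = np.linalg.norm(lhs - rhs) / np.linalg.norm(rhs)
print(rel_err)
```

If the printed relative error is on the order of the perturbation size, the first-order expression is consistent with the finite change.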

Edit: continuing this, I recognized there must be some permutation matrix $\pmb{P}$ such that:
$$
\pmb{P}\mathrm{vec}( (d\pmb{x})\pmb{x}^{\top} ) = \mathrm{vec}((d\pmb{X}) \otimes \pmb{X})
$$

which I can use to further derive:
$$
\begin{align}
d\pmb{f} &= \left(\left(\pmb{I} - \pmb{X} \otimes \pmb{X} \right)^{-\top} \otimes \left(\pmb{I} - \pmb{X} \otimes \pmb{X} \right)^{-1}\right)\pmb{P}\left((\pmb{x} \otimes \pmb{I}) + (\pmb{I} \otimes \pmb{x})\right)d\pmb{x} \\
\frac{\partial \pmb{f}}{\partial \pmb{x}} &= \left(\left(\pmb{I} - \pmb{X} \otimes \pmb{X} \right)^{-\top} \otimes \left(\pmb{I} - \pmb{X} \otimes \pmb{X} \right)^{-1}\right) \pmb{P}\left((\pmb{x} \otimes \pmb{I}) + (\pmb{I} \otimes \pmb{x})\right).
\end{align}
$$

This seems plausible, so all that remains is an expression for $\pmb{P}$. I guess it will take a form similar to the one in this answer, but I am not sure about it.
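One way to get a concrete $\pmb{P}$ is plain index bookkeeping: the product $A_{ij}B_{kl}$ appears once in $\mathrm{vec}(\pmb{a}\pmb{b}^{\top})$ and once in $\mathrm{vec}(\pmb{A} \otimes \pmb{B})$, so $\pmb{P}$ just moves it from one position to the other. The sketch below assumes a column-major $\mathrm{vec}$ and zero-based indices; the function name `kron_vec_permutation` is hypothetical, and a dense matrix is built only for clarity, not efficiency.

```python
import numpy as np

def vec(M):
    # Column-major vectorization.
    return M.reshape(-1, order="F")

def kron_vec_permutation(n):
    """Return P such that P @ vec(outer(vec(A), vec(B))) == vec(kron(A, B))
    for any n-by-n matrices A and B (hypothetical helper, dense for clarity)."""
    N = n * n
    P = np.zeros((N * N, N * N))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    # Position of A[i, j] * B[k, l] in vec(a b^T), with a = vec(A), b = vec(B).
                    src = (i + n * j) + N * (k + n * l)
                    # Position of the same product in vec(kron(A, B)).
                    dst = (i * n + k) + N * (j * n + l)
                    P[dst, src] = 1.0
    return P

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
P = kron_vec_permutation(n)

# The same P handles both orderings of the factors.
ok_AB = np.allclose(P @ vec(np.outer(vec(A), vec(B))), vec(np.kron(A, B)))
ok_BA = np.allclose(P @ vec(np.outer(vec(B), vec(A))), vec(np.kron(B, A)))
print(ok_AB, ok_BA)
```

Because the same $\pmb{P}$ works for either ordering of the two factors, it can be applied to both $\mathrm{vec}((d\pmb{X}) \otimes \pmb{X})$ and $\mathrm{vec}(\pmb{X} \otimes (d\pmb{X}))$.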

Best Answer

Let $X\in {\mathbb R}^{n\times n}$ and $E$ be the identity matrix of the same size.
Let's also denote the $k^{th}$ column of $X$ by $x_k$.

Define the matrices
$$\eqalign{
A &= E\otimes E - X\otimes X \cr
M &= \pmatrix{E\otimes x_1\cr E\otimes x_2\cr\vdots\cr E\otimes x_n} \cr
}$$

Calculate the differential of $A$.
$$\eqalign{
dA &= -(X\otimes dX+dX\otimes X) \cr
da &= {\rm vec}(dA) = -(M\otimes E+E\otimes M)\,dx \cr
}$$

Now we can answer the question.
$$\eqalign{
w &= A^{-1}y \cr
dw &= dA^{-1}y \cr
&= -A^{-1}\,dA\,A^{-1}y \cr
&= -{\rm vec}(A^{-1}\,dA\,w) \cr
&= -(w^T\otimes A^{-1})\,da \cr
&= (w^T\otimes A^{-1})\,(M\otimes E+E\otimes M)\,dx \cr
\frac{\partial w}{\partial x} &= (w^T\otimes A^{-1})\,(M\otimes E+E\otimes M) \cr
}$$
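The final formula can be compared against central finite differences. This is a rough NumPy sketch rather than a tuned implementation: it assumes a column-major ${\rm vec}$, forms the $n^4 \times n^2$ matrix explicitly (so it is only practical for small $n$), and uses illustrative names (`jacobian`, `w_of_x`) that do not come from the answer.

```python
import numpy as np

def vec(M):
    # Column-major vectorization.
    return M.reshape(-1, order="F")

def jacobian(X, y):
    """dw/dx = (w^T kron A^{-1}) (M kron E + E kron M), with w = A^{-1} y."""
    n = X.shape[0]
    E = np.eye(n)
    A = np.eye(n * n) - np.kron(X, X)
    w = np.linalg.solve(A, y)
    # M stacks the blocks E kron x_k, where x_k is the k-th column of X.
    M = np.vstack([np.kron(E, X[:, [k]]) for k in range(n)])
    G = np.kron(M, E) + np.kron(E, M)   # vec(dA) = -G dx
    return np.kron(w[None, :], np.linalg.inv(A)) @ G

def w_of_x(x, y, n):
    # w as a function of x = vec(X), used for the finite-difference check.
    X = x.reshape(n, n, order="F")
    return np.linalg.solve(np.eye(n * n) - np.kron(X, X), y)

rng = np.random.default_rng(1)
n = 3
X = 0.2 * rng.standard_normal((n, n))   # assumption: small entries keep A invertible
y = rng.standard_normal(n * n)

J = jacobian(X, y)

# Central differences, one column of the Jacobian per coordinate of x.
x = vec(X)
h = 1e-6
J_fd = np.column_stack([
    (w_of_x(x + h * e, y, n) - w_of_x(x - h * e, y, n)) / (2 * h)
    for e in np.eye(n * n)
])

max_err = np.abs(J - J_fd).max()
print(max_err)
```

For larger $n$ one would avoid building the Kronecker factors explicitly, but the dense version keeps the correspondence with the derivation easy to see.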