Kronecker product and vectorization of tensor modal product

kronecker productlinear algebramatricestensors

Given a tensor $\mathcal{S}\in\mathbb{R}^{n_1\times\ldots\times n_d}$ and a matrix $M\in \mathbb{R}^{m_k\times n_k}$, let $\mathcal{A}$ be the modal product of $\mathcal{S}$ and $M$ along the $k$-th mode, i.e. $\mathcal{A}_{(k)}=M \mathcal{S}_{(k)}$.
I would like to show the formula $$\operatorname{vec}({\mathcal{A}})=(I_{n_{k+1}\ldots n_d}\otimes M\otimes I_{n_1\ldots n_{k-1}})\cdot \operatorname{vec}(\mathcal{S}).$$ I understand how to show it when $k = 1$ because then $$\operatorname{vec}(\mathcal{A}) = \operatorname{vec}(\mathcal{A}_{(1)})=\operatorname{vec}(M\mathcal{S}_{(1)}) = \operatorname{vec}(M\mathcal{S}_{(1)}I_{n_2\ldots n_d})=(I_{n_2\ldots n_d}\otimes M)\operatorname{vec}(\mathcal{S}_{(1)}) = (I_{n_2}\otimes \ldots \otimes I_{n_d}\otimes M)\operatorname{vec}(\mathcal{S})$$ where I used the identiy $\operatorname{vec}(ABC) = (C^T\otimes A)\operatorname{vec}(B)$. However, if $k\neq 1$, it seems we don't have the identity $\operatorname{vec}(\mathcal{A}) = \operatorname{vec}(\mathcal{A}_{(k)})$, so how do I deal with those cases?

NB: Here $\mathcal{A}_{(k)}$ denotes the modal unfolding of the tensor $\mathcal{A}$ along the mode $k$ which is completely defined by $$\mathcal{A}_{(k)}(\hat{i}_k, \operatorname{col}(\hat{i}_k, \hat{n}_k)) = \mathcal{A}(i)$$ where $\hat{i}_k=[i_1,\ldots, i_{k-1},i_{k+1},\ldots i_d]$, $\hat{n}_k=[n_1,\ldots, n_{k-1},n_{k+1},\ldots n_d]$ and $\operatorname{col}(i,n)=i_1+(i_2-1)n_1+(i_3-1)n_1n_2+\ldots + (i_d-1)n_1\ldots n_{d-1}$.

Best Answer

One approach is to note that $\mathcal S$ can be decomposed as the sum of simple tensors, $$ \mathcal S =\sum_{j=1}^r \bigotimes_{i=1}^d S_{ij} = \sum_{j=1}^r S_{1,j} \bar \otimes \cdots \bar \otimes S_{k-1,j} \bar \otimes S_{k,j}\bar \otimes S_{k+1,j} \bar \otimes \cdots \bar \otimes S_{d,j}, $$ where $S_{ij} \in \Bbb R^{n_i}$ for all $j$ and $\bar \otimes$ denotes the "tensor product". Let $\phi_M^{(k)}$ denote the linear map associated with the modal product taken here. That is, $\phi_{M}^{(k)}$ is defined so that $\phi_M^{(k)}(\mathcal S) = \mathcal A$.

Note/show the following facts:

  • For all $j$, $$ \phi_M^{(k)}[S_{1,j} \bar \otimes \cdots \bar \otimes S_{k-1,j} \bar \otimes S_{k,j}\bar \otimes S_{k+1,j} \bar \otimes \cdots \bar \otimes S_{d,j}] = \\ S_{1,j} \bar \otimes \cdots \bar \otimes S_{k-1,j} \bar \otimes (MS_{k,j})\bar \otimes S_{k+1,j} \bar \otimes \cdots \bar \otimes S_{d,j} $$

  • For all $j$, $$ \operatorname{vec}[S_{1,j} \bar \otimes \cdots \bar \otimes S_{k-1,j} \bar \otimes S_{k,j}\bar \otimes S_{k+1,j} \bar \otimes \cdots \bar \otimes S_{d,j}] = \\ S_{d,j} \otimes \cdots \otimes S_{k+1,j} \otimes S_{k,j} \otimes S_{k-1,j} \otimes \cdots \otimes S_{1,j} $$

  • Denoting $ \hat M_k = I_{n_{k+1}\ldots n_d}\otimes M\otimes I_{n_1\ldots n_{k-1}}$, we have $$ \hat M_k(S_{d,j} \otimes \cdots \otimes S_{k+1,j} \otimes S_{k,j} \otimes S_{k-1,j} \otimes \cdots \otimes S_{1,j}) = \\ S_{d,j} \otimes \cdots \otimes S_{k+1,j} \otimes (MS_{k,j}) \otimes S_{k-1,j} \otimes \cdots \otimes S_{1,j} $$

From there, the proof is straightforward:

\begin{align} \operatorname{vec}(\mathcal A) &= \operatorname{vec} \left[\phi_M^{(k)}(\mathcal S)\right] \\ & = \operatorname{vec} \left[\phi_M^{(k)}\left[\sum_{j=1}^r S_{1,j} \bar \otimes \cdots \bar \otimes S_{k-1,j} \bar \otimes S_{k,j}\bar \otimes S_{k+1,j} \bar \otimes \cdots \bar \otimes S_{d,j} \right]\right] \\ & = \operatorname{vec} \left[\sum_{j=1}^r \phi_M^{(k)}\left[ S_{1,j} \bar \otimes \cdots \bar \otimes S_{k-1,j} \bar \otimes S_{k,j}\bar \otimes S_{k+1,j} \bar \otimes \cdots \bar \otimes S_{d,j} \right]\right] \\ & = \operatorname{vec} \left[\sum_{j=1}^r S_{1,j} \bar \otimes \cdots \bar \otimes S_{k-1,j} \bar \otimes (MS_{k,j})\bar \otimes S_{k+1,j} \bar \otimes \cdots \bar \otimes S_{d,j} \right] \\ & = \sum_{j=1}^r \operatorname{vec} \left[ S_{1,j} \bar \otimes \cdots \bar \otimes S_{k-1,j} \bar \otimes (MS_{k,j})\bar \otimes S_{k+1,j} \bar \otimes \cdots \bar \otimes S_{d,j} \right] \\ & = \sum_{j=1}^r S_{d,j} \otimes \cdots \otimes S_{k+1,j} \otimes (MS_{k,j}) \otimes S_{k-1,j} \otimes \cdots \otimes S_{1,j} \\ & = \sum_{j=1}^r \hat M\left[S_{d,j} \otimes \cdots \otimes S_{k+1,j} \otimes S_{k,j} \otimes S_{k-1,j} \otimes \cdots \otimes S_{1,j}\right] \\ & = \hat M\left[\sum_{j=1}^r S_{d,j} \otimes \cdots \otimes S_{k+1,j} \otimes S_{k,j} \otimes S_{k-1,j} \otimes \cdots \otimes S_{1,j}\right] \\ & = \hat M\left[\sum_{j=1}^r \operatorname{vec}\left[S_{1,j} \bar \otimes \cdots \bar \otimes S_{k-1,j} \bar \otimes S_{k,j}\bar \otimes S_{k+1,j} \bar \otimes \cdots \bar \otimes S_{d,j}\right]\right] \\ & = \hat M \operatorname{vec}\sum_{j=1}^r \left[S_{1,j} \bar \otimes \cdots \bar \otimes S_{k-1,j} \bar \otimes S_{k,j}\bar \otimes S_{k+1,j} \bar \otimes \cdots \bar \otimes S_{d,j}\right] = \hat M \operatorname{vec}(\mathcal S), \end{align}

which is what we wanted.


We could also save ourselves a lot of notational headache if we prove it for $k = 1$, then generalize that proof as follows. Let $\tau$ denote the "transpose" map over $\Bbb R^{n_1 \times \cdots \times n_d}$ that switches dimensions $1$ and $k$ of $\mathcal S$. If I've understood your terminology correctly, then I believe this means we're swapping the $1$st and $k$th "modes". Similarly, let $\sigma$ denote the map that does the same thing for $\mathcal A$ (noting that $\mathcal S$ and $\mathcal A$ have different shapes).

Note that there is a permutation matrix $P_\tau$ such that $$ \operatorname{vec}(\tau(\mathcal S)) = P \operatorname{vec}(\mathcal S) $$ holds for all $\mathcal S \in \Bbb R^{n_1 \times \cdots \times n_d}$, and similarly there is a corresponding $P_\sigma$. We can write \begin{align} \operatorname{vec}(\phi_M^{(k)} \mathcal S) & = \operatorname{vec}(\sigma^{-1}(\phi_M^{(1)} \tau(\mathcal S))) \\ & = P_{\sigma}^{-1}\operatorname{vec}(\phi_M^{(1)} \tau(\mathcal S)) \\ & = P_{\sigma}^{T}\hat M_{1} \operatorname{vec}(\tau(\mathcal S)) = P_{\sigma}^{T} \hat M_1 P_{\tau} \operatorname{vec}(\mathcal S). \end{align}

So, it suffices to show that $P_{\sigma}^{T} \hat M_1 P_{\tau} = \hat M_k$. To that end, it suffices to show that $v^\top P_{\sigma}^{T} \hat M_1 P_{\tau} w = v^\top \hat M_k w$ for all appropriately sized column-vectors $v,w$. Moreover, it suffices to show that this holds for vectors of the form $v = v_1 \otimes v_2 \otimes v_3$ and $w = w_1 \otimes w_2 \otimes w_3$ where $v_1,w_1 \in \Bbb R^{n_1 \cdots n_{k-1}}$, $v_2 \in \Bbb R^{m_k}$, $w_2 \in \Bbb R^{n_k}$, and $v_3,w_3 \in \Bbb R^{n_{k+1} \cdots n_d}$.

Related Question