Derivative of vector with vectorization

derivativeslinear algebramatrix-calculusoptimizationvector analysis

I have these constraints on a cost function

$$
c = A+Bx=A+B\text{vec}\ (q^*q^\top),
$$

where $(c,A)\in\mathbb{R}^{100}$, $B\in\mathbb{C}^{100\times 81}$, $x\in\mathbb{C}^{81}$ and $q\in\mathbb{C}^9$. So $x=\text{vec}\ (q^*q^\top)$, which is the vectorization operator. I want to speed up my optimizer and therefore i require the gradient of the constraints (with respect to $q$). This is how far i have come:

$$
\begin{aligned}
dc = Bdx &= Bd\text{vec}\ (q^*q^\top)\\
&=B\text{vec}\ (q^*dq^\top+dq^*q^\top) \\
&=B\text{vec}\ (q^H:dq)+B\text{vec}\ (q^\top:dq^*)
\end{aligned}
$$

However, i cannot seem to get rid of the $\text{vec}$ operator. If i "matricize" the left side to remove the vectorization at the right side, i cannot get to $\frac{\partial c}{\partial q}$ anymore. Anyone got some brilliance for me?

Update: The last line of my derivation is incorrect i think. $q^H\in\mathbb{C}^{12}$ while $dq\in\mathbb{C}^{1\times 12}$, so you cannot use the Frobenius product here.

Best Answer

So far, you have $$ dc = B \operatorname{vec}(q^*\,dq^\top) + B\operatorname{vec}(dq^*\, q^\top). $$ To obtain the components of the gradient, it suffices to plug in $q = e_j$ (where $e_1,\dots,e_9$ denote the standard basis vectors). So, we have $$ \frac{\partial c}{\partial q_j} = B \operatorname{vec}(q^*\,e_j^\top) + B\operatorname{vec}(e_j\, q^\top). $$ We could rewrite this in terms of the Kronecker product to "unvectorize". Note that $\operatorname{vec}(v w^T) = w \otimes v$, so that $$ \frac{\partial c}{\partial q_j} = B (e_j \otimes q^*) + B(q \otimes e_j). $$


Another option is to go in the opposite direction: instead of vectorizing, unvectorize everything. Suppose that we have $$ B = \sum_{j=1}^k P_j \otimes Q_j, $$ with $P_j,Q_j$ of size $10 \times 3$ (such a decomposition can be computed with reshaping and either SVD or rank factorization). We then have $$ B \operatorname{vec}(q^*q^T) = \sum_{j=1}^k P_j \otimes Q_j \operatorname{vec}(q^*q^T) \\ = \sum_{j=1}^k \operatorname{vec}(Q_jq^*q^TP_j^T) \\ = \sum_{j=1}^k \operatorname{vec}(Q_jq^*(P_jq)^T). $$ In other words, if we unvectorize $c$ into the $10 \times 10$ matrix $C$, then we have $$ C = [\text{const.}] + \sum_{j=1}^k (Q_jq^*(P_jq)^T). $$

Related Question