A matrix calculus problem with vectorization

matrices, matrix-calculus, vector-analysis, vectorization

I'm trying to find the derivative of $vec(AA^T)$ with respect to $vec(A)$, where $A$ is an $m$ by $n$ matrix.
I worked through a simple $3$ by $2$ case and computed it entry by entry.
My guess is that the answer is $A^T\otimes I_m+[(I_m\otimes a_1),\ (I_m\otimes a_2),\ \ldots,\ (I_m\otimes a_n)]^T$, where the $a_i$ are the columns of $A$ (I'm not sure this is correct).
Is there a systematic way to derive such an expression?
My intuition is to express $vec(AA^T)$ as some differentiable function of $vec(A)$ and then apply a few simple matrix calculus rules.
Any tips would be appreciated!

Best Answer

I think a common way to do this is to use index notation $A_{ij}$, where $i$ runs over the rows and $j$ over the columns of the matrix. Writing $F(A) = AA^T$, we have

$$F(A)_{ij} = \sum_k A_{ik} A_{jk}$$

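Notice that the transpose just swaps the indices of the second factor, which is why it appears as $A_{jk}$. Now we can take the derivative. The trick is to use completely different indices for the entry of $A$ we differentiate with respect to (below, $m$ and $n$ label the entry $A_{mn}$; they are not the dimensions of $A$)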

$$\frac{\partial F_{ij}}{\partial A_{mn}} = \frac{\partial}{\partial A_{mn}} \sum_k A_{ik} A_{jk}$$

Now we use the fact that all entries of the matrix are independent of one another, so that

$$\frac{\partial A_{ik}}{\partial A_{mn}} = \delta_{im} \delta_{kn}$$

where $\delta_{ij}$ is the Kronecker delta: it equals 1 if its two indices are equal and 0 otherwise. Thus, using the product rule for differentiation, we get

$$\frac{\partial F_{ij}}{\partial A_{mn}} = \sum_k(\delta_{im} \delta_{kn} A_{jk} + A_{ik} \delta_{jm} \delta_{kn}) = A_{jn} \delta_{im} + A_{in} \delta_{jm} $$
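As a sanity check, the formula can be compared against finite differences. The following is a minimal NumPy sketch, not part of the derivation itself; the function names are made up for illustration, and the code writes the entry being differentiated as `A[p, q]` (instead of $A_{mn}$) so that `m, n` can denote the shape of `A`.

```python
import numpy as np

def index_formula(A):
    """dF[i, j, p, q] = A[j, q] * delta(i, p) + A[i, q] * delta(j, p)."""
    m = A.shape[0]
    delta = np.eye(m)  # Kronecker delta as an m x m identity matrix
    return (np.einsum("jq,ip->ijpq", A, delta)
            + np.einsum("iq,jp->ijpq", A, delta))

def finite_difference(A, eps=1e-6):
    """Central differences of F(A) = A @ A.T with respect to each A[p, q]."""
    m, n = A.shape
    dF = np.zeros((m, m, m, n))
    for p in range(m):
        for q in range(n):
            E = np.zeros_like(A)
            E[p, q] = eps  # perturb a single entry of A
            F_plus = (A + E) @ (A + E).T
            F_minus = (A - E) @ (A - E).T
            dF[:, :, p, q] = (F_plus - F_minus) / (2 * eps)
    return dF

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))  # the 3 by 2 case from the question
max_err = np.max(np.abs(index_formula(A) - finite_difference(A)))
print(max_err)  # should be tiny, limited only by floating-point rounding
```

Because $F$ is quadratic in $A$, central differences carry no truncation error here, so any mismatch beyond rounding noise would point to a mistake in the formula.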

There are tons of matrix calculus rules, but instead of remembering all of them, it is sometimes easier to derive the one you need from scratch using index notation.
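To connect the index result back to the $vec$ form asked about, the four-index array can be flattened into a Jacobian matrix. The sketch below rests on assumptions not stated in the answer: $vec$ stacks columns (column-major order), the Jacobian has one row per entry of $vec(AA^T)$ and one column per entry of $vec(A)$, and `commutation_matrix` is a helper written here for illustration that satisfies $K\,vec(A) = vec(A^T)$. Under those conventions the two terms of the index formula appear to match $(A\otimes I_m) + (I_m\otimes A)K$, which the code checks numerically.

```python
import numpy as np

def jacobian_from_indices(A):
    """Flatten dF[i, j, p, q] into an (m*m) x (m*n) matrix, column-major."""
    m, n = A.shape
    delta = np.eye(m)
    dF = (np.einsum("jq,ip->ijpq", A, delta)
          + np.einsum("iq,jp->ijpq", A, delta))
    # Column-major vec: F[i, j] -> row i + j*m, A[p, q] -> column p + q*m
    return dF.reshape(m * m, m * n, order="F")

def commutation_matrix(m, n):
    """Permutation matrix K with K @ vec(A) == vec(A.T) for an m x n matrix A."""
    K = np.zeros((m * n, m * n))
    for p in range(m):
        for q in range(n):
            # A[p, q] sits at p + q*m in vec(A) and at q + p*n in vec(A.T)
            K[q + p * n, p + q * m] = 1.0
    return K

rng = np.random.default_rng(1)
m, n = 3, 2
A = rng.standard_normal((m, n))

J_index = jacobian_from_indices(A)
K = commutation_matrix(m, n)
J_kron = np.kron(A, np.eye(m)) + np.kron(np.eye(m), A) @ K

print(np.allclose(J_index, J_kron))
```

With the opposite layout convention the Jacobian is the transpose of this matrix, which may be why the guess in the question starts with $A^T\otimes I_m$.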