Derivative of submatrix with respect to the whole block matrix

derivatives, linear algebra, matrices, matrix-calculus

I am reading a research paper and am stuck on how the authors derived a formula. Suppose that we have the following block matrix
$$\underset{d \times d}{\boldsymbol{A}} = \begin{bmatrix} \underset{q \times q}{\boldsymbol{A}_{11}} & \underset{q \times (d - q)}{\boldsymbol{A}_{12}} \\ \underset{(d - q) \times q}{\boldsymbol{A}_{21}} & \underset{(d - q) \times (d - q)}{\boldsymbol{A}_{22}} \end{bmatrix}$$
The formula involves taking the derivative of a quantity that involves $\boldsymbol{A}_{22}$ with respect to $\boldsymbol{A}$. In particular, that quantity is
$$\text{trace} (\boldsymbol{A}_{22} \boldsymbol{B}),$$
where $\boldsymbol{B}$ is a matrix of dimension $(d - q) \times (d - q)$. We need to take the following derivative
$$\frac{\partial }{\partial \boldsymbol{A}} \text{trace} (\boldsymbol{A}_{22} \boldsymbol{B}).$$
In the context of my paper, $\boldsymbol{A}$ is a symmetric matrix, but if possible, I would like to treat $\boldsymbol{A}$ as a general square matrix.

Please help me if you have an idea. Thank you so much.

Best Answer

$\def\m#1{\left[\begin{array}{l|l}#1\end{array}\right]}\def\mm#1#2{\left[\begin{array}{l|l}#1\\\hline #2\end{array}\right]}$
Let $\,p=(d-q)$ and write the $d\times d$ identity matrix in two different block forms
$$\eqalign{ I_d &= \m{E_1&E_2} = \mm{I_q&0}{0&I_p} \\ E_1 &\doteq \mm{I_q}{0} \qquad\;\; E_2 \doteq \mm{0}{I_p} \\ }$$
The matrices $(E_i,\;E_k)$ can be used to extract the corresponding block of $A,\,$ e.g.
$$\eqalign{ A_{ik} &= E_i^TAE_k \\ }$$
Now we can write the cost function and calculate its gradient. Treating the entries of $A$ as independent, and using the cyclic property of the trace together with $\frac{\partial}{\partial A}{\rm Tr}(MA)=M^T$, we obtain
$$\eqalign{ f &= {\rm Tr}(A_{22}B) \\ &= {\rm Tr}(E_2^TAE_2B) \\ &= {\rm Tr}(E_2BE_2^TA) \\ \frac{\partial f}{\partial A} &= (E_2BE_2^T)^T = E_2B^TE_2^T = \mm{0&0}{0&B^T} \\ }$$
When $B$ is symmetric this reduces to $\mm{0&0}{0&B}$.
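The result can be sanity-checked numerically. The sketch below is an illustration only, not part of the paper: it assumes the convention $(\partial f/\partial A)_{ij} = \partial f/\partial A_{ij}$ with all $d^2$ entries of $A$ treated as independent (so no symmetry constraint is imposed), and it uses plain Python lists rather than a linear algebra library. The helper names `trace_A22_B`, `numerical_gradient` and `closed_form_gradient` are made up for this example. Under that convention the lower-right block of the gradient comes out as $B^T$, which coincides with $B$ whenever $B$ is symmetric.

```python
import random

def trace_A22_B(A, B, q):
    """f(A) = trace(A22 B), where A22 is the lower-right (d-q) x (d-q) block of A."""
    p = len(A) - q
    # trace(A22 B) = sum over i, j of A22[i][j] * B[j][i]
    return sum(A[q + i][q + j] * B[j][i] for i in range(p) for j in range(p))

def numerical_gradient(f, A, eps=1e-6):
    """Central finite differences, treating every entry of A as independent."""
    d = len(A)
    G = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            old = A[i][j]
            A[i][j] = old + eps
            f_plus = f(A)
            A[i][j] = old - eps
            f_minus = f(A)
            A[i][j] = old  # restore the entry before moving on
            G[i][j] = (f_plus - f_minus) / (2 * eps)
    return G

def closed_form_gradient(B, d, q):
    """Zero everywhere except the lower-right block, which holds B transposed."""
    p = d - q
    G = [[0.0] * d for _ in range(d)]
    for i in range(p):
        for j in range(p):
            G[q + i][q + j] = B[j][i]
    return G

# Arbitrary sizes and random (non-symmetric) test matrices, chosen for illustration.
random.seed(0)
d, q = 5, 2
p = d - q
A = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
B = [[random.gauss(0, 1) for _ in range(p)] for _ in range(p)]

G_num = numerical_gradient(lambda M: trace_A22_B(M, B, q), A)
G_exact = closed_form_gradient(B, d, q)

# Largest entrywise disagreement between the two gradients.
max_err = max(abs(G_num[i][j] - G_exact[i][j]) for i in range(d) for j in range(d))
print("max abs difference:", max_err)
```

Because $f$ is linear in $A$, the central difference is exact up to floating-point rounding, so the printed difference should be tiny. Using a non-symmetric $B$ here is deliberate: it makes the distinction between $B$ and $B^T$ visible in the check.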