Gradient of a complex valued matrix function but with real domain

complex-analysis linear-algebra matrix-calculus

Let $f: \mathbb{C}^{N\times M}\rightarrow \mathbb{R}$ and $g: \mathbb{R}^{N\times M}\rightarrow \mathbb{C}^{N \times M}$ with $N\geq M$, and let $F = f \circ g$. I am trying to compute the gradient of $F$ w.r.t. $\mathbf{X} \in \mathbb{R}^{N\times M}$, i.e., $\nabla_\mathbf{X} f(g(\mathbf{X}))$, but I am struggling with the chain rule because $f$ has a complex domain. What is the dimension of the final gradient matrix?

As an example, I have: $g(\mathbf{X})=e^{i\mathbf{X}}$ and $f(\mathbf{Y})=|| \mathbf{A}-\mathbf{YB}||_F^2$ ($\mathbf{A}$ and $\mathbf{B}$ complex as well).

Thank you in advance.

Best Answer

Let $E=\exp(iX)$ and $M=A-EB$. Your example then concerns the function $$\eqalign{ \def\LR#1{\left(#1\right)} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} \def\op#1{\operatorname{#1}} \def\trace#1{\op{Tr}\LR{#1}} \phi(X) &= \|A-EB\|_F^2 \cr &= \LR{A-EB}^*:\CLR{A-EB} \cr &= M^*:\c{M} \cr }$$ where $M^*$ denotes the complex conjugate of $M$ (without a transpose) and a colon denotes the trace/Frobenius product, i.e. $A:B=\trace{A^TB}$.
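The notation can be checked numerically. The following is a minimal sketch, using only the Python standard library and small hand-picked matrices (the helper names `matmul`, `transpose`, `conj` and `frob` are illustrative, not from any library). It confirms that $M^*:M$ equals the squared Frobenius norm and that a right factor can be moved across the colon as $P:(XB) = (PB^T):X$, the rule that lets $B$ be separated from $dE$.

```python
def matmul(P, Q):
    """Ordinary matrix product of two list-of-lists matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def conj(P):
    """Entry-wise complex conjugate (no transpose)."""
    return [[p.conjugate() for p in row] for row in P]

def frob(P, Q):
    """Frobenius product P:Q = Tr(P^T Q) = sum_ij P_ij * Q_ij (no conjugation)."""
    return sum(p * q for row_p, row_q in zip(P, Q) for p, q in zip(row_p, row_q))

# Illustrative 3 x 2 complex matrix
M = [[1 + 2j, 3 - 1j], [0.5j, -2 + 0j], [1 - 1j, 2 + 2j]]

norm_sq = sum(abs(m) ** 2 for row in M for m in row)   # ||M||_F^2
colon = frob(conj(M), M)                               # M^* : M
prod = matmul(transpose(conj(M)), M)                   # (M^*)^T M
trace = sum(prod[i][i] for i in range(len(prod)))      # Tr((M^*)^T M)

# Rearrangement rule P:(XB) = (P B^T):X with compatible shapes
P = [[1 + 1j, 2 - 1j], [3j, -1 + 0j]]                  # 2 x 2
X = [[1.0, -2.0, 0.5], [3.0, 0.0, -1.0]]               # 2 x 3
B = [[1 - 1j, 2 + 0j], [1j, -1 + 2j], [3 + 0j, 1 + 1j]]  # 3 x 2
lhs = frob(P, matmul(X, B))
rhs = frob(matmul(P, transpose(B)), X)

print(abs(colon - norm_sq), abs(trace - colon), abs(lhs - rhs))
```

All three printed differences should be zero up to floating-point rounding.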

Calculate the Wirtinger differential of this function $$\eqalign{ d\phi \;=\; M^*:dM + M:dM^* \;=\; 2\,{\mathcal Re}(M^*:dM)\cr }$$ Continuing $$\eqalign{ M^*:dM &= -M^*:dE\,B \cr &= \c{-M^*B^T}:d\exp(iX) \cr &= \c{C}:d\exp(iX) \cr &= C:d\LR{\sum_{k=0}^\infty q_kX^k} \\ &= C:\sum_{k=1}^\infty q_k\sum_{j=1}^kX^{j-1}\,dX\,X^{k-j} \cr &= \CLR{\sum_{k=1}^\infty q_k\sum_{j=1}^k\:X^{k-j}C^TX^{j-1}}^{\c T}:dX \cr &= \c{G}:dX \cr }$$ where, in addition to the Taylor series for the exponential $\LR{{\rm with\,\,} q_k=\frac{i^k}{k!}},\;$ I have introduced the matrices $(C,G)$ to hide some messy expressions.

Now we are in a position to write (recalling that $X$ is real) $$\eqalign{ d\phi &= (G+G^*):dX \cr \frac{\partial\phi}{\partial X} &= (G+G^*) \;=\; 2\,{\mathcal Re}(G) \cr }$$
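As a sanity check on this result (an added illustration, not part of the original derivation), consider the scalar case with $a, b, x$ standing in for $A, B, X$ and $m = a - e^{ix}b$, so that $C = -m^*b$. Everything commutes, so the double sum collapses: $$\eqalign{ G &= C\sum_{k=1}^\infty k\,q_k\,x^{k-1} \;=\; C\,\frac{d}{dx}e^{ix} \;=\; i\,e^{ix}\,C \cr \frac{\partial\phi}{\partial x} &= 2\,{\mathcal Re}(G) \;=\; 2\,{\mathcal Im}\left(e^{ix}\,m^*b\right) \;=\; 2\,{\mathcal Im}\left(a^*b\,e^{ix}\right) \cr }$$ This agrees with differentiating $\phi = |a|^2 + |b|^2 - 2\,{\mathcal Re}\left(a^*b\,e^{ix}\right)$ directly.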

Update

After writing the above, I noticed that your matrices are rectangular, which means you are applying the exponential function element-wise.

This makes the Taylor series unnecessary (and the result much simpler) because the differential of the element-wise exponential is $$dE = iE\odot dX$$

Picking up midway through the previous derivation, $$\eqalign{ M^*:dM &= C:dE \\ &= C:(iE\odot dX) \\&= \CLR{iE\odot C}:dX \\&= \c{H}:dX \\ \frac{\partial\phi}{\partial X} &= (H+H^*) \;=\; 2\,{\mathcal Re}(H) \\ }$$ where $\odot$ denotes the elementwise/Hadamard product. Since $H$ has the same shape as $E$, the gradient is a real matrix with the same dimensions as $X$.
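The element-wise formula is easy to test against finite differences. The sketch below is one possible check, not a reference implementation: it uses plain Python lists so that it runs without dependencies, and the sizes, the random test data and all function names are assumptions made for illustration. It evaluates $2\,{\mathcal Re}(iE\odot C)$ with $C=-M^*B^T$ and compares it with a central-difference approximation of $\partial\phi/\partial X$.

```python
import cmath
import random

def matmul(P, Q):
    """Ordinary matrix product of two list-of-lists matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def elementwise_exp(X):
    """E = exp(iX), applied entry by entry."""
    return [[cmath.exp(1j * x) for x in row] for row in X]

def phi(X, A, B):
    """phi(X) = ||A - E B||_F^2 with E = exp(iX) element-wise."""
    E_B = matmul(elementwise_exp(X), B)
    return sum(abs(a - eb) ** 2
               for row_a, row_eb in zip(A, E_B)
               for a, eb in zip(row_a, row_eb))

def grad_phi(X, A, B):
    """Closed form 2*Re(H), H = iE (Hadamard) C, C = -conj(M) B^T, M = A - E B."""
    E = elementwise_exp(X)
    E_B = matmul(E, B)
    M_conj = [[(a - eb).conjugate() for a, eb in zip(row_a, row_eb)]
              for row_a, row_eb in zip(A, E_B)]
    B_T = [list(col) for col in zip(*B)]
    C = [[-c for c in row] for row in matmul(M_conj, B_T)]
    return [[2 * (1j * e * c).real for e, c in zip(row_e, row_c)]
            for row_e, row_c in zip(E, C)]

def numeric_grad(X, A, B, h=1e-6):
    """Central finite differences, one entry of X at a time."""
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[i][j] += h
            X_minus[i][j] -= h
            G[i][j] = (phi(X_plus, A, B) - phi(X_minus, A, B)) / (2 * h)
    return G

def crand():
    return complex(random.gauss(0, 1), random.gauss(0, 1))

random.seed(0)
# Hypothetical sizes: X is n_rows x n_cols, B is n_cols x n_out, A is n_rows x n_out
n_rows, n_cols, n_out = 4, 3, 2
X = [[random.gauss(0, 1) for _ in range(n_cols)] for _ in range(n_rows)]
A = [[crand() for _ in range(n_out)] for _ in range(n_rows)]
B = [[crand() for _ in range(n_out)] for _ in range(n_cols)]

grad = grad_phi(X, A, B)
num = numeric_grad(X, A, B)
max_err = max(abs(g - n) for row_g, row_n in zip(grad, num)
              for g, n in zip(row_g, row_n))
print("max |closed form - finite difference| =", max_err)
```

The returned gradient is real and has the same shape as `X`. With NumPy arrays the closed form would likely reduce to a single expression along the lines of `2 * np.real(1j * E * -(np.conj(A - E @ B) @ B.T))`, where `*` is the element-wise product.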
