Gradient of $C \mapsto \frac{1}{2}\left\lVert CA - BC \right\rVert_F^2$


Given the matrices $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{m \times m}$, let the scalar field $f : \mathbb{R}^{m \times n} \to \mathbb{R}$ be defined by

$$ f(C) := \frac{1}{2}\left\lVert CA - BC \right\rVert_F^2 $$

What is the gradient $\nabla f$?


I am trying to differentiate this function with respect to $C$, but I cannot find a way to manipulate the expression that would let me do so. I have also tried applying the definition of the derivative directly, but I do not end up with anything that looks useful at first glance. I arrive at a linear map $df(C)$ defined by the expression

$$
df(C)E = \text{trace} \left\{ (CA - BC)^T (EA - BE)\right\} = \left\langle CA - BC, EA - BE\right\rangle
$$

which then leads me to

$$
df(C) = \left\langle AA^TC^T - AC^TB^T - A^TC^TB + C^TB^TB, \cdot \right\rangle
$$

Is this expression correct?

Best Answer

Let

$$ f({\bf X}) := \frac12 \left\| {\bf X} {\bf A} - {\bf B} {\bf X} \right\|_{\text{F}}^2 $$

Using the definition of the Frobenius norm and the cyclic property of the trace,

$$ \nabla_{{\bf X}} f({\bf X}) = \cdots = \color{blue}{({\bf X} {\bf A} - {\bf B} {\bf X}) {\bf A}^\top - {\bf B}^\top ({\bf X} {\bf A} - {\bf B} {\bf X})} $$
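
One way to fill in the omitted steps is sketched here. The residual ${\bf R} := {\bf X} {\bf A} - {\bf B} {\bf X}$ and the arbitrary perturbation ${\bf E} \in \mathbb{R}^{m \times n}$ are symbols introduced only for this illustration, and the inner product is assumed to be $\langle {\bf M}, {\bf N} \rangle = \operatorname{tr}({\bf M}^\top {\bf N})$. Expanding $f$ along ${\bf E}$ gives

$$ f({\bf X} + {\bf E}) = f({\bf X}) + \left\langle {\bf R}, {\bf E} {\bf A} - {\bf B} {\bf E} \right\rangle + \frac12 \left\| {\bf E} {\bf A} - {\bf B} {\bf E} \right\|_{\text{F}}^2 $$

The last term is second order in ${\bf E}$, so the directional derivative is the middle term. Using the cyclic property of the trace to move ${\bf A}$ and ${\bf B}$ onto ${\bf R}$,

$$ \left\langle {\bf R}, {\bf E} {\bf A} - {\bf B} {\bf E} \right\rangle = \operatorname{tr} \left( {\bf A} {\bf R}^\top {\bf E} \right) - \operatorname{tr} \left( {\bf R}^\top {\bf B} {\bf E} \right) = \left\langle {\bf R} {\bf A}^\top - {\bf B}^\top {\bf R}, {\bf E} \right\rangle $$

Since this holds for every ${\bf E}$, the gradient is ${\bf R} {\bf A}^\top - {\bf B}^\top {\bf R}$, which matches the expression above. Under the same inner-product convention, the matrix in the question's final formula appears to be the transpose of this one (it is $n \times m$ rather than $m \times n$), so it pairs with ${\bf E}$ as $\operatorname{tr}(\,\cdot\, {\bf E})$ rather than as $\langle \cdot, {\bf E} \rangle$.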


Addendum

Suppose that we would like to find where the gradient vanishes. We then have the following linear matrix equation.

$$ ({\bf X} {\bf A} - {\bf B} {\bf X}) {\bf A}^\top - {\bf B}^\top ({\bf X} {\bf A} - {\bf B} {\bf X}) = {\bf O}_{m \times n} $$

Vectorizing both sides, we obtain the following homogeneous linear system

$$ \left( \left( {\bf A} {\bf A}^\top \otimes {\bf I}_m \right) - \left( {\bf A} \otimes {\bf B} \right) - \left( {\bf A} \otimes {\bf B} \right)^\top + \left( {\bf I}_n \otimes {\bf B}^\top {\bf B}\right) \right) \operatorname{vec} ({\bf X}) = {\bf 0}_{mn} $$
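
Both the gradient formula and the vectorized system can be checked numerically. The following is a minimal sketch, assuming NumPy, small random test matrices, and the column-stacking convention for $\operatorname{vec}$; the helper names `f`, `grad_f`, `kron_matrix` and `vec` are hypothetical and chosen only for this example.

```python
import numpy as np

def f(X, A, B):
    """Objective: half the squared Frobenius norm of X A - B X."""
    R = X @ A - B @ X
    return 0.5 * np.sum(R * R)

def grad_f(X, A, B):
    """Closed-form gradient (X A - B X) A^T - B^T (X A - B X)."""
    R = X @ A - B @ X
    return R @ A.T - B.T @ R

def kron_matrix(A, B):
    """Coefficient matrix of the vectorized equation (column-stacking vec)."""
    n, m = A.shape[0], B.shape[0]
    AB = np.kron(A, B)
    return (np.kron(A @ A.T, np.eye(m)) - AB - AB.T
            + np.kron(np.eye(n), B.T @ B))

def vec(M):
    """Stack the columns of M into one vector (Fortran order)."""
    return M.reshape(-1, order="F")

# Arbitrary test data: sizes and seed are assumptions for illustration.
rng = np.random.default_rng(0)
m, n = 3, 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))
X = rng.standard_normal((m, n))

# Central finite differences, one entry of X at a time.
h = 1e-6
G_fd = np.zeros_like(X)
for i in range(m):
    for j in range(n):
        E = np.zeros_like(X)
        E[i, j] = h
        G_fd[i, j] = (f(X + E, A, B) - f(X - E, A, B)) / (2 * h)

fd_error = np.max(np.abs(G_fd - grad_f(X, A, B)))

# The Kronecker matrix applied to vec(X) should reproduce vec(gradient).
K = kron_matrix(A, B)
kron_error = np.max(np.abs(K @ vec(X) - vec(grad_f(X, A, B))))

# K also factors as M^T M, where M vec(X) = vec(X A - B X).
M = np.kron(A.T, np.eye(m)) - np.kron(np.eye(n), B)
factor_error = np.max(np.abs(K - M.T @ M))

print(fd_error, kron_error, factor_error)
```

All three printed errors should be close to zero, up to rounding. The factorization in the last check suggests that the coefficient matrix is symmetric positive semidefinite, so the gradient vanishes exactly when ${\bf X} {\bf A} = {\bf B} {\bf X}$.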

