[Math] Derivative of Frobenius norm

derivatives, matrix-calculus, normed-spaces, scalar-fields

I am trying to calculate the derivative of an energy function with respect to a vector $x$. The energy is given by:

$$ψ(A)=∥A−I∥_F^2.$$

Here $A$ is a square matrix whose columns are the column vectors $x_1, \ldots, x_n$:

$$A=[x_1 x_2 x_3 … x_n]$$

The aim is to find $$\frac{∂ψ}{∂x}$$

[Petersen 06] gives the derivative of the squared Frobenius norm as $$ \frac{∂∥X∥_F^2}{∂X}=2X$$ but I am unsure how to extend it to this case (presumably using the chain rule somehow).

Best Answer

Recall that the Frobenius norm $\lVert\cdot\rVert_F \colon \mathrm{Mat}_{n,m}(\mathbf R) \to \mathbf R$ is given by $$ \lVert A\rVert_F = \operatorname{tr}(A^t A)^{1/2}, $$ and hence the derivative of $\lVert\cdot\rVert_F^2$ is (using $\operatorname{tr}(A^t H) = \operatorname{tr}(H^t A)$) $$ D(\lVert\cdot\rVert_F^2)(A)H = \operatorname{tr}(A^t H) + \operatorname{tr}(H^t A) = 2 \operatorname{tr}(A^t H). $$

Fix $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_m \in \mathbf R^n$ and denote the map $x_i \mapsto [x_1, \ldots, x_m] \in \mathrm{Mat}_{n,m}(\mathbf R)$ by $A^{\hat x}$ (for $\psi$ we need $m = n$, since $A - I$ only makes sense for square $A$). By the chain rule, the derivative of $x_i \mapsto \psi\bigl(A^{\hat x}(x_i)\bigr)$ is given by $$ D\psi(A^{\hat x}(x_i))DA^{\hat x}(x_i). $$

Now $A^{\hat x}$ is affine, hence $DA^{\hat x}(x_i)$ is its linear part $h \mapsto [0, \ldots, 0, h, 0, \ldots, 0] \in \mathrm{Mat}_{n,m}(\mathbf R)$, with $h$ in the $i$-th column. Since $A \mapsto A - I$ is a translation, $D\psi$ is given by $$ D\psi(A)H = 2\operatorname{tr}\bigl((A-I)^t H\bigr). $$ Hence, with $A = A^{\hat x}(x_i)$, $$ \frac{\partial \psi}{\partial x_i}(h) = D\psi(A)DA^{\hat x}(x_i)h = 2\operatorname{tr}\bigl((A-I)^t[0,\ldots, 0, h, 0,\ldots, 0]\bigr). $$

Now $(A - I)^t = A^t - I^t$ has the rows $x_j^t - e_j^t$, and $$ (A-I)^t[0,\ldots, 0, h,0,\ldots, 0] = [0, \ldots, 0, (A^t - I^t)h,0,\ldots, 0]. $$ The only nonzero column of this matrix is the $i$-th one, so its trace is the $i$-th entry of $(A^t - I^t)h$. This leaves us with $$ \frac{\partial \psi}{\partial x_i}(h) = 2(x_i^t - e_i^t)h, $$ that is, the gradient of $\psi$ with respect to the column $x_i$ is $2(x_i - e_i)$.
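As a sanity check, the closed form $2(x_i - e_i)$ can be compared with a central finite-difference estimate. The following is only a minimal sketch in plain Python: the helper names `psi`, `grad_column` and `grad_column_fd`, the matrix size and the step `eps` are all illustrative choices and not part of the answer above.

```python
import random

def psi(A):
    """Squared Frobenius norm of A - I, for a square matrix given as a list of rows."""
    n = len(A)
    return sum((A[r][c] - (1.0 if r == c else 0.0)) ** 2
               for r in range(n) for c in range(n))

def grad_column(A, i):
    """Closed-form gradient of psi with respect to column i: 2 * (x_i - e_i)."""
    n = len(A)
    return [2.0 * (A[r][i] - (1.0 if r == i else 0.0)) for r in range(n)]

def grad_column_fd(A, i, eps=1e-6):
    """Central finite-difference estimate of the same gradient (illustrative step size)."""
    n = len(A)
    g = []
    for r in range(n):
        plus = [row[:] for row in A]    # copy of A with entry (r, i) nudged up
        minus = [row[:] for row in A]   # copy of A with entry (r, i) nudged down
        plus[r][i] += eps
        minus[r][i] -= eps
        g.append((psi(plus) - psi(minus)) / (2 * eps))
    return g

random.seed(0)
n = 4  # arbitrary small size for the check
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

# Largest disagreement between the closed form and the numerical estimate,
# taken over every column and every entry of that column.
max_err = max(
    abs(a - b)
    for i in range(n)
    for a, b in zip(grad_column(A, i), grad_column_fd(A, i))
)
print(max_err)
```

Because $\psi$ is quadratic in the entries of $A$, the central difference is exact up to rounding, so the printed discrepancy should be tiny. Stacking the column gradients side by side gives the matrix form $2(A - I)$, which matches the rule $\partial\lVert X\rVert_F^2/\partial X = 2X$ quoted in the question, applied to $X = A - I$.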