[Math] Hessian of Frobenius norm

hessian-matrix, least squares, matrices, matrix-calculus

I want to find the Hessian of the following function: $F(\mathbf{X}) = \frac{1}{2}\Vert \mathbf{Y} - \mathbf{AX}\Vert _F^2$.

My attempt: using the trace formula for the Frobenius norm, $F(\mathbf{X})$ can be written as
$$
F(\mathbf{X}) = \frac{1}{2}\operatorname{tr}[(\mathbf{Y} - \mathbf{AX})(\mathbf{Y} - \mathbf{AX})^T] = \frac{1}{2}\left(\operatorname{tr}(\mathbf{YY}^T) - \operatorname{tr}(\mathbf{YX}^T\mathbf{A}^T) - \operatorname{tr}(\mathbf{AXY}^T) + \operatorname{tr}(\mathbf{AXX}^T\mathbf{A}^T) \right)
$$

So,
$$
\nabla F(\mathbf{X}) = \frac{1}{2}\left( 0 - \mathbf{A}^T\mathbf{Y} - \mathbf{A}^T\mathbf{Y} + \mathbf{A}^T\mathbf{AX} + \mathbf{A}^T\mathbf{AX}\right) = - \mathbf{A}^T\mathbf{Y} + \mathbf{A}^T\mathbf{AX}
$$
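A quick way to sanity-check a gradient formula like this one is to compare it with central finite differences. The following is only a sketch using the Python standard library; the helper names (`matmul`, `transpose`, `sub`, `numerical_grad`, `rand_matrix`), the matrix sizes, and the random test data are assumptions made for illustration and are not part of the question.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    return [list(col) for col in zip(*P)]

def sub(P, Q):
    """Entrywise difference P - Q."""
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def F(A, X, Y):
    """F(X) = 1/2 * ||Y - A X||_F^2."""
    R = sub(Y, matmul(A, X))
    return 0.5 * sum(r * r for row in R for r in row)

def grad_F(A, X, Y):
    """Closed form from the derivation: A^T A X - A^T Y = A^T (A X - Y)."""
    return matmul(transpose(A), sub(matmul(A, X), Y))

def numerical_grad(A, X, Y, eps=1e-6):
    """Central differences, perturbing one entry of X at a time."""
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += eps
            Xm[i][j] -= eps
            G[i][j] = (F(A, Xp, Y) - F(A, Xm, Y)) / (2 * eps)
    return G

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
A = rand_matrix(4, 3, rng)   # sizes are arbitrary, chosen for illustration
X = rand_matrix(3, 2, rng)
Y = rand_matrix(4, 2, rng)

G_exact = grad_F(A, X, Y)
G_num = numerical_grad(A, X, Y)
max_err = max(abs(a - b) for ra, rb in zip(G_exact, G_num) for a, b in zip(ra, rb))
print("largest entrywise difference:", max_err)
```

Because $F$ is quadratic in $\mathbf{X}$, central differences have no truncation error here, so any remaining difference should be pure floating-point rounding.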

I used formulas 102, 101, and 109 of the Matrix Cookbook. Now,
$$
\nabla^2 F(\mathbf{X}) = 0 + \frac{d(\mathbf{A}^T\mathbf{AX})}{d\mathbf{X}} = \mathbf{A}^T\mathbf{A}
$$

I'm not sure about the last step: can I write it this way? Please help me clear up this confusion.

A related question was asked in Derivative of a Matrix w.r.t. a Matrix.

Best Answer

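We look for the linear and the bilinear terms in the multivariate Taylor approximation at $x$ (written as a subscript): $$f(x+u) \approx f_x + Df_x\,(u) + \tfrac12 \operatorname{Hess} f_x\,(u,u).$$ Let $f(x)=\|Ax\|^2$. Then $$ \langle A(x+h),A(x+h)\rangle=\|Ax\|^2+2\langle Ax,Ah\rangle+\langle Ah,Ah\rangle. $$ The linear term is $2\langle Ax,Ah\rangle=\langle 2A^TAx,h\rangle$, so the gradient is $2A^TAx$. The quadratic term can be written as $\langle Ah,Ah\rangle=\tfrac12 \langle h,2A^TAh\rangle$, thus the Hessian is $H=2A^TA$, a constant matrix. The same computation works for both the Euclidean ($L^2$) norm and the Frobenius norm, where in the matrix case the inner product is $\langle U,V\rangle=\operatorname{tr}(U^TV)$. For $F(\mathbf{X})$ in the question, the factor $\tfrac12$ cancels the 2 and the constant $\mathbf{Y}$ only changes the lower-order terms, so the Hessian is the linear map $\mathbf{U}\mapsto\mathbf{A}^T\mathbf{A}\mathbf{U}$, which agrees with $\nabla^2 F=\mathbf{A}^T\mathbf{A}$.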
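The Taylor argument can also be checked numerically: since $F$ is quadratic, the second-order expansion with quadratic term $\tfrac12\langle \mathbf{U},\mathbf{A}^T\mathbf{A}\mathbf{U}\rangle$ should hold exactly rather than approximately. The sketch below uses only the Python standard library; the helper names, the matrix sizes, and the random data are assumptions made for illustration.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    return [list(col) for col in zip(*P)]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def sub(P, Q):
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def inner(P, Q):
    """Frobenius inner product <P, Q> = tr(P^T Q)."""
    return sum(p * q for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

def F(A, X, Y):
    """F(X) = 1/2 * ||Y - A X||_F^2."""
    R = sub(Y, matmul(A, X))
    return 0.5 * inner(R, R)

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(1)
A = rand_matrix(5, 3, rng)   # sizes are arbitrary, chosen for illustration
X = rand_matrix(3, 2, rng)
Y = rand_matrix(5, 2, rng)
U = rand_matrix(3, 2, rng)   # perturbation direction; need not be small

AtA = matmul(transpose(A), A)
grad = matmul(transpose(A), sub(matmul(A, X), Y))   # A^T (A X - Y)

lhs = F(A, add(X, U), Y)
rhs = F(A, X, Y) + inner(grad, U) + 0.5 * inner(U, matmul(AtA, U))
gap = abs(lhs - rhs)
print("|F(X+U) - second-order expansion| =", gap)
```

The gap should be at the level of rounding error for any `U`, which is consistent with the Hessian acting as $\mathbf{U}\mapsto\mathbf{A}^T\mathbf{A}\mathbf{U}$ and with all higher-order terms vanishing.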