Gradient of squared norm

vector analysis

I am looking for the gradient of the function

$$
f(x)= \dfrac{1}{2} \Vert A^Tx \Vert^2 - b^Tx \, .
$$

Well, so far I came up with

$$
\nabla f(x) = AA^Tx-b
$$

because

$$
\begin{aligned}
f(x) &= \dfrac{1}{2} \Vert A^Tx \Vert^2 - b^Tx \\
&= \dfrac{1}{2} (A^Tx)^T(A^Tx) - b^Tx \\
&= \dfrac{1}{2} x^TAA^Tx - b^Tx \\
\end{aligned}
$$

and therefore, using $\nabla_x \left( x^T M x \right) = (M + M^T)x$ with $M = AA^T$,

$$
\begin{aligned}
\nabla f(x) &= \dfrac{1}{2} (AA^T + (AA^T)^T)x - b \\
&= \dfrac{1}{2} (AA^T + AA^T)x - b \\
&= \dfrac{1}{2} \cdot 2 \cdot AA^T x - b \\
&= AA^T x - b \\
\end{aligned}
$$
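As a sanity check, here is a minimal NumPy sketch (with randomly generated $A$, $b$, $x$, chosen purely for illustration) comparing this formula against a central finite-difference approximation of the gradient:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 3
A = rng.standard_normal((m, n))   # illustrative random data
b = rng.standard_normal(m)
x = rng.standard_normal(m)

def f(x):
    # f(x) = (1/2) ||A^T x||^2 - b^T x
    return 0.5 * np.dot(A.T @ x, A.T @ x) - b @ x

# Closed-form gradient derived above
grad = A @ A.T @ x - b

# Central finite differences, one coordinate at a time
eps = 1e-6
fd = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(m)
])

print(np.allclose(grad, fd, atol=1e-5))  # True
```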

Is my reasoning correct? Is there a shorter way to calculate the gradient?

Best Answer

We can also compute this as follows, using the expansion $\|u+v\|^2 = \|u\|^2 + 2\langle u,v\rangle +\|v\|^2$ with $u = A^Tx$ and $v = A^Th$:

$$\begin{align}f(x+h) - f(x) &= \frac{1}{2}\|A^T(x + h)\|^2 - \frac{1}{2}\|A^Tx\|^2 - b^T(x+h) + b^Tx\\ &= \langle A^Tx,A^T h\rangle + \frac{1}{2}\|A^Th\|^2 - b^T h\\ &= \langle AA^Tx - b,h\rangle + o(\Vert h \Vert),\end{align}$$ since $\langle A^Tx, A^Th\rangle = \langle AA^Tx, h\rangle$ and $\frac{1}{2}\|A^Th\|^2 \le \frac{1}{2}\|A^T\|^2\|h\|^2 = o(\Vert h \Vert)$. Reading off the linear part in $h$, the gradient $\nabla f(x)$ is $AA^Tx - b$.
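This directional-derivative argument can also be checked numerically: for small $t$, the difference quotient $\bigl(f(x+th)-f(x)\bigr)/t$ should approach $\langle AA^Tx - b, h\rangle$. A short self-contained sketch, again with random data chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 3
A = rng.standard_normal((m, n))   # illustrative random data
b = rng.standard_normal(m)
x = rng.standard_normal(m)
h = rng.standard_normal(m)        # arbitrary direction

f = lambda x: 0.5 * np.dot(A.T @ x, A.T @ x) - b @ x

t = 1e-6
quotient = (f(x + t * h) - f(x)) / t   # difference quotient along h
inner = (A @ A.T @ x - b) @ h          # <AA^T x - b, h>

print(abs(quotient - inner))           # small; shrinks linearly with t
```

The leftover discrepancy is exactly the dropped quadratic term $\frac{t}{2}\|A^Th\|^2$, which vanishes as $t \to 0$.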