Sufficient (first-order) condition for optimality

optimization

Consider the problem

minimize $f(x)=\|Ax-b\|_2^2$,

where $A$ is an $m\times n$ matrix with $m\geq n$, and $b$ is a vector of length $m$. Assume that the rank of $A$ is equal to $n$.

We can write down the first-order necessary condition for optimality: if $x_*$ is a local minimizer, then $\nabla f(x_*)=0$.

Is this also a sufficient condition?

Best Answer

Yes, this is also sufficient. The quick way to see this is to note that $f$ is convex (its Hessian is $\nabla^2 f(x) = 2A^TA$, which is positive semidefinite), so every stationary point of $f$ is a global minimizer. Alternatively, you can compute the derivative of $f$ directly: if $f'(x) = 0$, then for every vector $v$,

\begin{align} 0 = f'(x)\, v & = \lim_{\epsilon \rightarrow 0} \frac{1}{\epsilon} [f(x + \epsilon v) - f(x)]\\ & = \lim_{\epsilon \rightarrow 0} \frac{1}{\epsilon} [ \| A(x + \epsilon v) - b\|^2 - \|Ax - b\|^2 ] \\ & = \lim_{\epsilon \rightarrow 0} [ \langle Ax - b, Av \rangle + \langle Av, Ax - b \rangle + \epsilon \|Av\|^2 ] \\ & = \langle A^T A x - A^T b, v \rangle + \langle v, A^T Ax - A^T b \rangle. \end{align} If this holds for every $v$, then taking $v = A^T A x - A^T b$ gives $2\|A^T A x - A^T b\|^2 = 0$, so $A^T A x - A^T b = 0$. These are precisely the so-called normal equations, which one solves to find the minimizer of the linear least-squares problem. Since $A$ has rank $n$, the matrix $A^T A$ is invertible, so the minimizer $x_* = (A^T A)^{-1} A^T b$ is unique.
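As a numerical sanity check, here is a minimal sketch (assuming NumPy is available; the random matrix and vector below are made-up test data) that solves the normal equations for a full-column-rank $A$ and confirms the result agrees with NumPy's built-in least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 3
A = rng.standard_normal((m, n))  # random m x n matrix, full column rank almost surely
b = rng.standard_normal(m)

# Solve the normal equations A^T A x = A^T b.
# Since rank(A) = n, the matrix A^T A is symmetric positive definite, hence invertible.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Compare with NumPy's least-squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_normal, x_lstsq))             # True: both give the same minimizer
print(np.linalg.norm(A.T @ (A @ x_normal - b)))   # ~0: the gradient vanishes there
```

In practice one usually avoids forming $A^TA$ explicitly and instead solves the least-squares problem via a QR or SVD factorization (as `np.linalg.lstsq` does), since the normal equations square the condition number of $A$.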
