Given the gradient descent algorithm:
$$x_{i+1} = x_i - \alpha_i D_i \nabla f(x_i)$$
with $\alpha_i$ sufficiently small and $D_i > 0$ (positive definite).
I was under the assumption that the algorithm always converges, either to a global minimum if $f$ is convex or to a local minimum otherwise. Is this not the case?
Does $f$ have to be convex for a global minimum to be found? In general, does gradient descent really just find a stationary point, with further analysis required to determine whether it is a minimum, a maximum, or a saddle/inflection point?
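For concreteness, here is a minimal NumPy sketch of the update I have in mind (with a fixed step size and a fixed positive definite $D$ for simplicity; the tolerance, iteration cap, and the quadratic test function are just illustrative choices):

```python
import numpy as np

def gradient_descent(grad_f, x0, alpha=0.1, D=None, tol=1e-8, max_iter=10_000):
    """Preconditioned gradient descent: x_{i+1} = x_i - alpha * D @ grad_f(x_i)."""
    x = np.asarray(x0, dtype=float)
    if D is None:
        D = np.eye(x.size)  # default preconditioner: the identity
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:  # stop once we are near a stationary point
            break
        x = x - alpha * D @ g
    return x

# Example: f(x, y) = x^2 + 10 y^2 with gradient (2x, 20y); global minimum at the origin.
print(gradient_descent(lambda x: np.array([2 * x[0], 20 * x[1]]),
                       x0=[3.0, -2.0], alpha=0.05))  # close to [0, 0]
```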
Best Answer
There are some provisos (beyond the obvious requirement that $f$ be sufficiently smooth), the first being that gradient descent need not converge at all: with a badly chosen step size, the iterates can oscillate or diverge.
That said, suppose gradient descent does converge to some point. This point might be a saddle point, or even a local maximum (for example, if $x_0$ is itself a saddle point, the gradient there is zero and you will never leave it). But this is very unlikely: you will converge to a local minimum from almost all starting points, and moreover, if you do converge to a saddle point, you can "slide off" it by applying a small random perturbation to your solution and running the iteration again, as the sketch below illustrates.
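For example (reusing the `gradient_descent` sketch from the question, with an illustrative double-well function): starting anywhere on the axis $x = 0$ of $f(x, y) = x^4/4 - x^2/2 + y^2/2$, the $x$-component of the gradient vanishes, so the iterates stay on that axis and converge to the saddle at the origin; a tiny random perturbation then lets them slide off to one of the minima at $(\pm 1, 0)$.

```python
# f(x, y) = x^4/4 - x^2/2 + y^2/2: minima at (±1, 0), saddle at (0, 0).
grad = lambda x: np.array([x[0]**3 - x[0], x[1]])

# Starting on the x = 0 axis, the iterates never leave it: we converge to the saddle.
x_saddle = gradient_descent(grad, x0=[0.0, 2.0], alpha=0.1)
print(x_saddle)  # approximately [0, 0], a saddle point

# A small random perturbation lets the iterates slide off toward a minimum.
rng = np.random.default_rng(0)
x_min = gradient_descent(grad, x0=x_saddle + 1e-6 * rng.standard_normal(2), alpha=0.1)
print(x_min)  # approximately [1, 0] or [-1, 0]
```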
If $f$ is convex, every critical point of $f$ is a global minimum, so if gradient descent converges at all, it converges to a global minimum. You might also converge to a global minimum when $f$ is not convex, but that depends on the initial guess, the step size, the preconditioner, the line search strategy, and so on; there are no guarantees.
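As a concrete illustration of that dependence on the initial guess (again using the sketch from the question, on a hypothetical tilted double well in one dimension): the two basins of attraction lead to different minima, and only one of them is global.

```python
# Tilted double well: f(x) = x^4/4 - x^2/2 + x/4 (an illustrative choice).
# Minima near x = -1.11 (global, f ≈ -0.51) and x = 0.84 (merely local, f ≈ -0.02).
grad_tilted = lambda x: np.array([x[0]**3 - x[0] + 0.25])

print(gradient_descent(grad_tilted, x0=[-2.0], alpha=0.1))  # ≈ [-1.11]: the global minimum
print(gradient_descent(grad_tilted, x0=[ 2.0], alpha=0.1))  # ≈ [ 0.84]: a local minimum only
```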