[Math] Why would gradient descent ever diverge

convex optimization, gradient descent, numerical optimization, optimization

I have a doubt. I read somewhere that gradient descent will diverge if the step size chosen is too large. But gradient descent with, say, exact line search only accepts a step if it actually moves downhill, i.e. $f(x_{k+1}) < f(x_k)$.

What I read that led to this doubt (from some slides):

Ideally the method chooses the direction of the negative gradient (which by itself points towards the inner contours), and the step size is chosen, via exact line search, up to the point where $f$ stops decreasing along that direction. So in the best case, the isotropic (circular-contour) case, it should land directly at the minimum.
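
To make the circular case concrete, here is a small worked example (the quadratic below is my own assumed illustration, not from the slides). Take $f(x) = \tfrac{1}{2}\|x\|^2$, whose contours are circles. From any $x_0$ the descent direction is $-\nabla f(x_0) = -x_0$, and exact line search minimizes
$$\varphi(\alpha) = f(x_0 - \alpha x_0) = \tfrac{1}{2}(1-\alpha)^2\|x_0\|^2,$$
which gives $\alpha = 1$ and $x_1 = 0$, i.e. the minimum in a single step. So why would the method ever diverge?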

Best Answer

The gradient descent method means that at each iteration you move from the current point to the next along the direction opposite to the gradient (the update rule is written out after the list below).

Each iteration is a function of 2 parameters:

  • Direction.
  • Step Size.
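
In symbols (a standard way to write the iteration, consistent with the description above):
$$x_{k+1} = x_k + \alpha_k d_k, \qquad d_k = -\nabla f(x_k),$$
where $\alpha_k > 0$ is the step size; exact line search picks $\alpha_k = \arg\min_{\alpha \ge 0} f(x_k + \alpha d_k)$.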

The gradient descent direction only promises that there is a small ball within which the value of the function decreases (unless you are at a stationary point).
Yet the size (radius) of this ball isn't known.
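
One way to quantify that radius (this is the standard descent lemma, stated here under the extra assumption that $\nabla f$ is $L$-Lipschitz, which the argument above does not require):
$$f(x - \alpha \nabla f(x)) \le f(x) - \alpha\left(1 - \tfrac{L\alpha}{2}\right)\|\nabla f(x)\|^2,$$
so any step size $0 < \alpha < 2/L$ is guaranteed to decrease $f$, while larger steps may not.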

There are many algorithms to find a valid step size.
One of them (probably the hardest) is exact line search.
In practice a better choice would be backtracking.
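
A minimal sketch of backtracking with the Armijo sufficient-decrease condition (the function name and the constants `beta` and `c` are my own illustrative choices, not prescribed by the answer):

    import numpy as np

    def backtracking_step(f, grad_f, x, alpha0=1.0, beta=0.5, c=1e-4):
        # Shrink the step until the Armijo sufficient-decrease condition holds:
        #   f(x - alpha*g) <= f(x) - c*alpha*||g||^2
        g = grad_f(x)
        fx = f(x)
        alpha = alpha0
        while f(x - alpha * g) > fx - c * alpha * np.dot(g, g):
            alpha *= beta  # step too large: try a smaller one
        return alpha

    # Example: one backtracking gradient step on an anisotropic quadratic
    f = lambda x: 0.5 * (x[0]**2 + 10.0 * x[1]**2)
    grad_f = lambda x: np.array([x[0], 10.0 * x[1]])
    x = np.array([1.0, 1.0])
    alpha = backtracking_step(f, grad_f, x)
    x_next = x - alpha * grad_f(x)   # satisfies f(x_next) < f(x)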

For a large step size you may land outside the "ball" where the function is decreasing and actually end up at a worse point.
Iterating this may cause the method to diverge.
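
A minimal sketch of that blow-up on a one-dimensional quadratic (the numbers below are my own assumed example): for $f(x) = \tfrac{1}{2}\lambda x^2$ the fixed-step update is $x_{k+1} = (1 - \alpha\lambda)\, x_k$, so the iterates diverge whenever $\alpha > 2/\lambda$.

    lam = 10.0     # curvature of f(x) = 0.5 * lam * x**2
    alpha = 0.25   # fixed step size; 0.25 > 2/lam = 0.2, so every step overshoots
    x = 1.0
    for k in range(10):
        x = x - alpha * lam * x   # gradient step: multiplies x by (1 - alpha*lam) = -1.5
        print(k, x)
    # |x| grows by a factor of 1.5 each iteration: the method diverges.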
