[Math] Does “approximate” gradient descent work

approximationgradient descentnumerical optimizationoptimization

In order to solve the convex problem

$$ \min_x f(x)$$

I use the so-called approximate gradient descent algorithm, whose $(k+1)$th iterate has the form

$$ x_{k+1} = x_k – \gamma \tilde{\nabla}f(x_k),$$
where $\tilde{\nabla}f(x_k)$ is an approximation of $\nabla f(x_k)$ and $\gamma > 0$ is some sufficiently small step size.


Question: Is it possible for one to (formally) prove that

$$f(x_{k+1}) \leq f(x_k),$$

or, more generally, that the approximate gradient descent "works". If so, how?

Since I am not using a stochastic approximation to the gradient, probabilistic arguments are not useful in this case.


Any directions, either to relevant papers or thoughts on how one could proceed with a proof (i.e., conditions on $\gamma$ or $\epsilon(x_k) = \tilde\nabla f(x_k) – \nabla f(x_k)$) would be helpful.

Best Answer

One option is to consider the dynamical system

$$ e_k = x_k - x^*, $$

where $x^*$ is the extremum.

Then, $e_{k+1} = e_k - \alpha \nabla f(x_k)$.

Take a function $V(e_k) = V_k := e_k^2$.

Then,

$$ V_{k+1} - V_k = \alpha^2 ( \nabla f(x_k) )^2 - 2 \alpha e_k \nabla f(x_k) $$

If it is possible to show that $V_{k+1} - V_k < 0, \forall e \ne 0$, then it is called a Lyapunov function for the system and in turn $e_k$ converges asymptotically to the equilibrium $e = 0$ whence $x = x^*$.

Observe that $e_k < 0 \implies \nabla f(x_k) < 0$ and $e_k > 0 \implies \nabla f(x_k) > 0$. So, if $\alpha$ is to be taken as positive, the condition

$$ \alpha_k < 2\frac{|e_k|}{|\nabla f(x_k)|}, e_k = 0 \implies \alpha_k := 0 $$

is a possible choice of an adaptive gain $\alpha_k$ which renders the equilibrium asymptotically stable.

If the gradient is approximate, then a condition that its sign coincides with the sign of $e_k$ suffices for the argument to work. Otherwise, you need to define the region where this condition holds. Its complement is the attractor. Indeed, if in some region the gradient changes sign arbitrarily, then no stability can be shown.