Let $f: \mathbb R \to \mathbb R$ be a $C^1$ convex function, satisfying the growth conditions
$$\lim_{x \to -\infty} \nabla f(x) = -\infty, \lim_{x \to \infty} \nabla f(x) = \infty.$$
and let $\gamma_t: [0, \infty) \to \mathbb R$ be a deterministic, Borel measurable process satisfying the following conditions:
- $\gamma_t > 0$ for all $t \in [0, \infty)$.
- $\gamma_t \to 0$ as $t \to \infty$.
- $\int_0^\infty \gamma_t^2 \, dt < \infty$.
Consider the solution $X$ to the one dimensional SDE
$$dX_t = -\nabla f(X_t) \, dt + \gamma_t \, dW_t, \, \, X_0 = x_0$$
with $W$ a standard Brownian motion, and $x_0 \in \mathbb R$ an arbitrary initial condition.
Question: Is it true that for all initial conditions $x_0$, we have $\nabla f(X_t) \to 0$ as $t \to \infty$ almost surely?
Best Answer
here "An SDE perspective on stochastic convex optimization" they study these type of SDEs (Stochastic gradient descent) and basically ask bounded and an L2 condition for gamma_t in theorem 3.1.
Here in (H0) they also have extra constraints to $f$ that might be sharp.
As shown here a function can be convex and C1 but its gradient not be Lipschitz
And as shown here without this Lipschitz condition, we don't even have convergence for the deterministic gradient descent.