Convergence rate of Gradient Descent

convex-optimization, gradient-descent, optimization

I was trying to solve a simple gradient descent problem. If we have $f(x) = x^2$ and a learning rate $\eta$ that guarantees the algorithm converges, then in how many steps will my algorithm be $\epsilon$ away from the minimum (which is $0$)? That is, how many steps are required to get $\vert x_i - x_{min} \vert < \epsilon$, which is basically $\vert x_i \vert < \epsilon$?

My first approach was to try to break down $x_i$ and reformulate it in terms of $x_0$, because the answer will obviously depend on my starting point, but I got a bit lost there. Does anyone have a hint to share?

Thank you!

Best Answer

For this simple problem, argue that the gradient descent update is $x_t = x_{t-1} - \eta f'(x_{t-1}) = (1-2\eta) x_{t-1}$. Then, argue that $$ x_t = (1-2\eta)^t x_0. $$ Then, see if this sequence converges for all $\eta$ or if you need some condition on $\eta$ for that to happen.
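To see the hint in action, here is a minimal Python sketch (my own illustration, not part of the original answer; the values of `eta`, `x0`, and `eps` are assumed example choices) that iterates the update for $f(x) = x^2$ and compares the observed step count against the closed form $x_t = (1-2\eta)^t x_0$:

```python
import math

# Iterate the gradient descent update x_t = x_{t-1} - eta * f'(x_{t-1})
# for f(x) = x^2, where f'(x) = 2x, and count the steps until |x| < eps.
# eta, x0, and eps below are illustrative values, not from the question.

eta = 0.1      # learning rate, chosen so that |1 - 2*eta| < 1
x0 = 5.0       # starting point
eps = 1e-6     # target distance from the minimum at 0

x = x0
steps = 0
while abs(x) >= eps:
    x = x - eta * 2 * x   # the update x_t = (1 - 2*eta) * x_{t-1}
    steps += 1

# Closed form: |1 - 2*eta|**t * |x0| < eps
# => t > log(eps / |x0|) / log|1 - 2*eta|
t_pred = math.ceil(math.log(eps / abs(x0)) / math.log(abs(1 - 2 * eta)))

print(f"iterated steps: {steps}, predicted steps: {t_pred}")
```

With these values both counts come out to 70, which matches the geometric decay the hint points at: the number of steps grows logarithmically in $1/\epsilon$ and in $\vert x_0 \vert$.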
