I was trying to solve a simple gradient descent problem. If we have $f(x) = x^2$ and a learning rate $\eta$ that guarantees the algorithm converges, then how many steps does the algorithm need to get within $\epsilon$ of the minimum (which is $0$)? That is, how many steps are required to get $\vert x_i - x_{\min} \vert < \epsilon$, which here is simply $\vert x_i \vert < \epsilon$?
My first approach was to try to break down $x_i$ and reformulate it in terms of $x_0$, since the answer will obviously depend on the starting point, but I got a bit lost there. Does anyone have a hint to share?
Thank you!
Best Answer
For this simple problem, note that $f'(x) = 2x$, so the gradient descent update is $x_t = x_{t-1} - \eta f'(x_{t-1}) = (1-2\eta) x_{t-1}$. Then, argue by induction that $$ x_t = (1-2\eta)^t x_0. $$ Then, see whether this sequence converges for all $\eta$ or whether you need some condition on $\eta$ for that to happen.
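To see the hint in action, here is a minimal numerical sketch (function names are my own, not from the answer): it runs the update $x \leftarrow (1-2\eta)x$ until $\vert x \vert < \epsilon$, and compares the step count against the closed form obtained by solving $\vert 1-2\eta \vert^t \vert x_0 \vert < \epsilon$ for $t$.

```python
import math

def gd_steps(x0, eta, eps):
    """Iterate gradient descent on f(x) = x^2 until |x| < eps.

    The update x <- x - eta * f'(x) = (1 - 2*eta) * x converges
    iff |1 - 2*eta| < 1, i.e. 0 < eta < 1.
    """
    x, steps = x0, 0
    while abs(x) >= eps:
        x = (1 - 2 * eta) * x
        steps += 1
    return steps

def closed_form_steps(x0, eta, eps):
    """Smallest t with |1-2*eta|^t * |x0| < eps, from taking logs:
    t > log(eps / |x0|) / log|1 - 2*eta|  (both logs are negative)."""
    return math.ceil(math.log(eps / abs(x0)) / math.log(abs(1 - 2 * eta)))

# Example: eta = 0.25 gives x_t = 0.5^t * x_0, so reaching 1e-3 from
# x_0 = 1 takes ceil(log(1e-3)/log(0.5)) = 10 steps.
print(gd_steps(1.0, 0.25, 1e-3))          # → 10
print(closed_form_steps(1.0, 0.25, 1e-3)) # → 10
```

Note the two counts can disagree by one step in the edge case where $\vert 1-2\eta\vert^t \vert x_0\vert$ lands exactly on $\epsilon$, since the bound is strict.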