[Math] Understanding convergence of fixed point iteration

convergence-divergence, derivatives, fixed-point iteration, numerical methods

I was reading some slides explaining the convergence of fixed-point iteration, but honestly I don't have an intuitive idea of how fixed-point iteration methods converge.

Assuming $p < 1$, as required by the fixed-point theorem, yields

$$|x_{k+1} - x^*| = |g(x_k) - g(x^*)| \leq p |x_k - x^*|$$

This is a contraction by a factor $p$. So

$$|x_{k+1} - x^*| \leq p|x_k - x^*| \leq p^2|x_{k-1} - x^*| \leq \cdots \leq p^{k+1}|x_{0} - x^*| \rightarrow 0$$

The smaller the $p$, the faster the convergence.
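To see the contraction in action, here is a small sketch (my own example, not from the slides) iterating $g(x) = \cos x$, which is a contraction near its fixed point $x^* \approx 0.739$ since $|g'(x)| = |\sin x| \approx 0.674 < 1$ there:

```python
import math

# Illustrative example: iterate g(x) = cos(x), whose fixed point x*
# satisfies cos(x*) = x*. Near x*, |g'(x)| = |sin(x)| ≈ 0.674 < 1.
g = math.cos
x_star = 0.7390851332151607  # fixed point of cos, found numerically

x = 1.0
prev_err = abs(x - x_star)
for k in range(10):
    x = g(x)
    err = abs(x - x_star)
    # The error shrinks by roughly the contraction factor each step,
    # so the ratio printed below hovers around 0.67.
    print(f"k={k+1:2d}  error={err:.2e}  ratio={err/prev_err:.3f}")
    prev_err = err
```

The printed ratio is the empirical per-step factor $p$: each iteration multiplies the distance to $x^*$ by roughly the same number below 1, which is exactly the geometric decay $p^{k+1}|x_0 - x^*|$ above.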

Could someone explain this to me?

I also have another problem. There's a statement saying:

For root $x_1^*$ the conditions of the fixed-point theorem hold with $|g'(x)| < 0.4$, and we expect faster convergence than with the bisection method.

Regarding this last statement, I would have a few questions.

  1. What's the relation between the definition above, and the derivative of $g$ being less than $0.4$?

  2. Why would it be faster than the bisection method? I know that the bisection method converges linearly, but honestly I still don't have an intuitive idea of these convergence rates.
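One way to build intuition for the comparison (a sketch with my own choice of tolerance and initial error, not from the slides): bisection shrinks the bracketing interval by a factor $0.5$ per step, while a fixed-point iteration with $|g'(x)| < 0.4$ shrinks the error by at most $0.4$ per step, so it reaches the same tolerance in fewer steps:

```python
# Count how many steps a per-step error factor p needs to reach a
# tolerance of 1e-8 starting from an error of 1 (both choices are mine).
def steps_to_tol(p, tol=1e-8, err=1.0):
    n = 0
    while err > tol:
        err *= p
        n += 1
    return n

print(steps_to_tol(0.4))  # fixed-point bound with p = 0.4 -> 21 steps
print(steps_to_tol(0.5))  # bisection halves the interval   -> 27 steps
```

Both rates are linear, but a smaller per-step factor means a steeper geometric decay, which is why $|g'(x)| < 0.4$ beats bisection's $0.5$.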

Best Answer

From your slides you have a contraction mapping $g$, i.e. a function with the following property: $|g(x)-g(y)| \leq p\cdot|x-y|$, where $p < 1$, for all $x$ and $y$ in the domain of $g$. For a fixed point $x^*$ we must have $g(x^*) = x^*$ by the definition of a fixed point, and by the construction of the iterative process we have $g(x_k) = x_{k+1}$ for all $k$. From this, the first line of your slide follows:

$$\left|x_{k+1} - x^* \right| = \left| g(x_k) - g(x^*)\right| \leq p \cdot |x_k - x^*|$$

What this says, intuitively, is that each time we apply $g$ to $x_k$ we move a little closer to $x^*$: the distance between the current iterate and the fixed point shrinks because of the contraction.

The size of $p$ matters for the speed of convergence because the smaller $p$ is, the faster $p^n \rightarrow 0$ as $n\rightarrow \infty$. If you compare $p=10^{-2}$ and $p=10^{-6}$, it should be obvious that $10^{-6n}$ shrinks faster than $10^{-2n}$.

For the rest, Hagen's answer is elegantly clear.
