Numerical Methods – Why Does Fixed Point Iteration Work?

fixed-point-theorems, numerical-methods

I have searched online for an answer, but everyone gave the method, and no one explained why it works.
I'll first write what I do understand.

Let $f(x)$ be a continuous function on $[a,b]$.
Suppose $f$ has a root in $[a,b]$, and we want to find it; that is, we want to find $x_0$ satisfying $f(x_0)=0$. So we rewrite the equation $f(x)=0$ in the form $g(x)=x$, and the problem of finding a root of $f$ reduces to finding a fixed point of $g$.
For example, let $f(x)=x^2-x-1$, so we are interested in solving $x^2-x-1=0$. Rewriting it, we get $x=\frac{1}{x-1}$.
Now define $g(x)=\frac{1}{x-1}$, and the problem is reduced to finding a fixed point of $g$, i.e. finding $x_0$ which satisfies $g(x_0)=x_0$.
Now, to find that $x_0$, the method says to pick some starting point $x_1$, evaluate $g(x_1)$, then $g(g(x_1))$, and so on; in other words, we compute
$x_{n+1}=g(x_n)$ for $n=1,2,3,\dots$ until we reach the desired accuracy.
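A minimal sketch of this procedure in Python, using the example $g$ above (the starting point and tolerance are my own choices, not part of the method):

```python
def g(x):
    return 1.0 / (x - 1.0)        # the rewriting of x^2 - x - 1 = 0 from the example

x = 0.5                           # an arbitrary starting guess (my choice)
for _ in range(100):
    x_new = g(x)
    if abs(x_new - x) < 1e-12:    # stop once successive iterates agree closely
        break
    x = x_new

print(x)                          # approximately -0.6180339887, i.e. (1 - sqrt(5))/2
print(x * x - x - 1)              # approximately 0: x is indeed a root of f
```

Note that from this starting point the iteration happens to converge to the negative root $\frac{1-\sqrt5}{2}$ of $x^2-x-1$, not the golden ratio; which root (if any) the iteration finds depends on $g$ and on the starting point.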
What I don't get is: why does this work?
More specifically, why does evaluating $g(g(g(\dots g(x))))$ yield the fixed point of $g$?
I know there are some circumstances in which this doesn't work (whatever they are, I don't know), but I'm referring to the cases where it does.
Another (related?) question: why is $g$ undefined at $x=1$? Does it just happen to be this way, or does it have some implication regarding $f$'s roots?

Thanks.

Best Answer

Let us assume that the function $g$ is defined on an interval $(a,b)$, that $g(x)\in(a,b)$ for every $x$ in that interval, and that there is a constant $c<1$ such that for all $x,y \in (a,b)$, \begin{equation} \left|g(y)-g(x)\right| < c\,|x-y|. \tag{1} \end{equation} If $g$ is differentiable, the condition $|g'(x)|<c$ for all $x\in(a,b)$ suffices, by the mean value theorem.
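Condition (1) can be checked numerically for a concrete $g$. A small sketch, using an example of my own (not from the answer): $g=\cos$ on $(0,1)$, where $|g'(x)|=|\sin x|\le\sin 1<1$, so (1) holds with $c=\sin 1$:

```python
import math
import random

# Example of my choosing: g = cos on (0, 1); |g'(x)| = |sin x| < sin(1) there,
# so (1) holds with c = sin(1) by the mean value theorem.
g = math.cos
a, b = 0.0, 1.0
c = math.sin(1.0)

random.seed(0)
worst = 0.0
for _ in range(10_000):
    x, y = random.uniform(a, b), random.uniform(a, b)
    if x != y:
        worst = max(worst, abs(g(y) - g(x)) / abs(y - x))

print(worst < c)   # True: every sampled pair satisfies |g(y) - g(x)| < c |x - y|
```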

The fixed point iteration is defined by $x_{k+1}=g(x_k)$, where $x_0$ is an arbitrarily chosen starting point in $(a,b)$.

Let us assume that the function has a fixed point at $\hat{x}\in(a,b)$, that is $\hat{x}=g(\hat{x})$.

Now at step $k$, the absolute error of our current guess to the fixed point is $e_k = |x_k-\hat{x}|$. We get $$ e_{k+1} = |x_{k+1}-\hat{x}| = |g(x_k)-g(\hat{x})| < c|x_k - \hat{x}| = c e_k. $$

Therefore, the sequence $(e_k)_k$ is nonnegative and bounded above by the sequence $(c^ke_0)_k$, which converges to $0$; hence $\lim_{k\to\infty}e_k=0$. This means that the fixed point iteration converges to $\hat{x}$.
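The geometric decay $e_{k+1} < c\,e_k$ can be observed numerically. A sketch, again with my own example $g=\cos$ (a contraction on $[0,1]$ with $c=\sin 1$ and fixed point $\hat{x}\approx 0.739085$):

```python
import math

g = math.cos                      # example contraction on [0, 1] (my choice); |g'| < sin(1) < 1
c = math.sin(1.0)                 # a valid constant for (1) on [0, 1]
x_hat = 0.7390851332151607        # the fixed point of cos: cos(x_hat) == x_hat

x = 0.2                           # arbitrary starting point in (0, 1)
e_prev = abs(x - x_hat)           # e_0
for _ in range(20):
    x = g(x)
    e = abs(x - x_hat)
    assert e < c * e_prev         # the key step of the proof: e_{k+1} < c * e_k
    e_prev = e

print(abs(x - x_hat) < c ** 20 * abs(0.2 - x_hat))   # True: e_20 < c^20 * e_0
```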


For general $g:\mathbb{R}\to\mathbb{R}$, we can make the following observations:

If (1) holds on all of $\mathbb{R}$, we can replace $(a,b)$ with $\mathbb{R}$ in the above proof. One can also see that the function has exactly one fixed point in that case (if $g$ is differentiable, the derivative of $g(x)-x$ is bounded above by $c-1<0$, so $g(x)-x$ is strictly decreasing and has exactly one zero; if $g$ is not differentiable, a similar argument still holds).

If (1) does not hold on all of $\mathbb{R}$ but holds on an interval $(a,b)$ containing a fixed point $\hat{x}$, let $r=\min(\hat{x}-a,\,b-\hat{x})$ and consider the symmetric subinterval $I=(\hat{x}-r,\hat{x}+r)$. For $x\in I$ we have $|g(x)-\hat{x}|=|g(x)-g(\hat{x})|<c\,|x-\hat{x}|<r$, so $g(x)\in I$ as required. The fixed point iteration therefore converges to the fixed point whenever $x_0$ is chosen inside $I$.
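The requirement that (1) holds near the fixed point is essential. The question's $g(x)=\frac{1}{x-1}$ has two fixed points, the roots of $x^2-x-1$; near the golden ratio $\varphi=\frac{1+\sqrt5}{2}$ we have $|g'(\varphi)|=\frac{1}{(\varphi-1)^2}\approx 2.618>1$, so no interval around $\varphi$ satisfies (1), and the iteration is repelled from it. A sketch:

```python
import math

def g(x):
    return 1.0 / (x - 1.0)

phi = (1.0 + math.sqrt(5.0)) / 2.0   # a fixed point of g with |g'(phi)| ~ 2.618 > 1

x = phi + 0.001                      # start very close to phi
dist = [abs(x - phi)]
for _ in range(5):
    x = g(x)
    dist.append(abs(x - phi))

print(dist)    # the distance to phi grows at every step: phi repels the iteration
```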
