[Math] Intuition of Picard Iteration

analysis, ordinary differential equations

I understand the proof of the Picard-Lindelöf Theorem, but am having trouble understanding why someone would attempt to use Lipschitz continuity in the first place. The condition of Lipschitz continuity seems ad hoc. What is the motivation for this on the intuitive level? This could be answered by answering one of the following questions:

Why would the failure to be Lipschitz continuous at a point allow solutions to branch, that is, fail to be unique? (I am not looking for an example, but an intuitive understanding.)

Why would one go about looking for a solution through a Picard iteration, as in, why would someone have the intuition that the integral equation (induced by the differential equation) should contract approximations?

Best Answer

Before we begin I would like to bring your attention to three points.

( i ) The proofs we find in modern books are not the original one discovered by Picard and Lindelöf but refined versions of it. Back in 1894 they had a different mindset, different intuitions and much less readily available knowledge. Furthermore, the branch of functional analysis was in its early years, Lipschitz continuity was not yet known as Lipschitz continuity (in fact Lipschitz was still alive) and Banach was just a baby.

( ii ) Picard iteration is proposed in chapter XI, section III, of (E. Picard) Traité d'Analyse II as a successive-approximations method to show the existence (and implicitly the uniqueness) of solutions, as an alternative to the Cauchy-Lipschitz method, which is studied in detail in subchapters XI.I and XI.II. The content discussed in these two sections is very likely what motivated his own method; particularly XI.II, entitled "On an important property of Cauchy-Lipschitz method". (Nowadays the names Cauchy-Lipschitz theorem and Picard-Lindelöf theorem refer to the same theorem.)

( iii ) The kind of intuition you should expect is a more abstract and technically refined intuition which is found throughout functional analysis, since the argument used in the Picard-Lindelöf theorem is essentially a functional analysis argument. This kind of intuition does indeed suggest "ad hoc conditions" in some situations, but keep in mind that deep down they are very well educated guesses.

That said, let's discuss a plausible modern path to craft the proof of uniqueness, with the intention that at the right moment Lipschitz continuity will look evident. We start with an open set $\mathcal{O} \subseteq \mathbb{R}^n$, an open interval $I \subseteq \mathbb{R}$, $ f: \mathcal{O} \times I \to \mathbb{R}^n $ continuous, $x : I \to \mathbb{R}^n$ differentiable, the problem in its differential form $$\begin{cases} \dot{x}(t) = f(x(t), t) \\ x(t_0) = x_0 \end{cases} \tag{1}$$

and its equivalent integral form $$ x(t) = x_0 + \int_{t_0}^t f(x(s), s)\ ds \tag{2}$$

(According to the hypotheses, there exists a local solution.)

Someone acquainted with functional spaces and maps between them is able to note that the RHS of $(2)$ is one of these, namely $$ \mathcal{F} : X \to Y : x \mapsto \left(I \to \mathbb{R}^n : t \mapsto x_0 + \int_{t_0}^t f(x(s), s)\ ds \right) $$

where $X$ and $Y$ are suitable functional spaces. Thus $(2)$ is simply $$ x = \mathcal{F}(x) \tag{3} $$

This means that the function $x$ is a fixed point of the higher-order function $\mathcal{F}$. (Points in functional spaces are functions.)
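To make this higher-order map concrete, here is a minimal numerical sketch of my own (not from any of the sources cited here): $\mathcal{F}$ acts on functions represented by their values on a grid, with the integral computed by the trapezoidal rule. The sanity check verifies that for $\dot{x} = x$, $x(0) = 1$, the exact solution $e^t$ is (up to quadrature error) a fixed point:

```python
import math

def picard_operator(f, x0, ts):
    """Return the map F, where F(xs) holds the grid values of
    x0 + integral from ts[0] to t of f(x(s), s) ds  (trapezoidal rule)."""
    def F(xs):
        out = [x0]
        for i in range(1, len(ts)):
            h = ts[i] - ts[i - 1]
            out.append(out[-1] + 0.5 * h * (f(xs[i - 1], ts[i - 1]) + f(xs[i], ts[i])))
        return out
    return F

# Sanity check: for f(x, t) = x with x0 = 1, the exact solution e^t
# should be mapped (approximately) to itself.
ts = [i / 1000 for i in range(1001)]          # grid on [0, 1]
F = picard_operator(lambda x, t: x, 1.0, ts)
exact = [math.exp(t) for t in ts]
residual = max(abs(a - b) for a, b in zip(F(exact), exact))
```

Here `residual` is only quadrature error (on the order of $10^{-7}$ for this grid), which is the discrete analogue of $x = \mathcal{F}(x)$.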

With enough familiarity with complete metric or Banach spaces, since we want this fixed point to be unique, we start to wonder about the Banach fixed-point theorem. If we could somehow find a complete metric space that is invariant under $\mathcal{F}$, in the sense that $\mathcal{F}(X) \subseteq X$, with $x \in X$, we could use it.

It does not take long to remember one of the best-known spaces: the set of bounded functions, $\mathcal{B}(I, \mathbb{R}^n)$, which together with the supremum norm, $\| \cdot \|_{\infty}$, constitutes a Banach space, hence induces a complete metric space. Since $x$ is continuous, to make it bounded we just restrict to a compact subinterval $J \subset I$ and henceforth consider $X = Y = (\mathcal{B}(J, \mathbb{R}^n), \| \cdot \|_{\infty})$.

Skipping the finer details, as we try to show $\mathcal{F}(X) \subseteq X$ we find that $X$ and $Y$ should be restricted to the functions satisfying $\| x(t) - x_0 \| \le \beta$ for a suitable $\beta$, that is, the functions whose distance to $x_0$ does not exceed $\beta$. It turns out that such a subspace is still complete. Furthermore, to make sure that solutions do not exit $f$'s domain, we restrict it to $\overline{B}(x_0, \beta) \times J$.

Now that we are certain that $X$ is an adequate complete metric space and $\mathcal{F}(X) \subseteq X$, we wonder whether $\mathcal{F}$ can be made a contraction. We already have a very nice compact convex domain; there is not much else to ask for here. So what can we strengthen now? The only thing left is $f$'s continuity (which, by the way, is now uniform). Well, there are different levels of continuity. The usual types of continuity on compact sets are ordered, from weakest to strongest, as follows

  1. uniformly (globally) continuous
  2. locally Hölderian
  3. (globally) Hölderian
  4. locally Lipschitzian
  5. (globally) Lipschitzian
  6. bounded derivative
  7. continuously differentiable

Since showing that $\mathcal{F}$ is a contraction amounts to showing the inequality

$$ \| \mathcal{F}(x)(t) - \mathcal{F}(y)(t) \| = \left\| \int_{t_0}^{t} (f(x(s),s) - f(y(s), s))\ ds \right\| \le \lambda \|x - y \|_{\infty} \tag{4}$$

and integration interacts nicely with norms

$$ \left\| \int_{t_0}^{t} (f(x(s),s) - f(y(s), s))\ ds \right\| \le \left| \int_{t_0}^{t} \| f(x(s),s) - f(y(s), s) \| ds \right| \tag{5}$$

it is reasonable to look for a type of continuity that also relates norms by an inequality. In particular, $(4)$ and $(5)$ suggest Hölder or Lipschitz continuity. Since Hölder continuity does not seem convenient here because of its exponent, we choose (global) Lipschitz continuity. This immediately allows the following estimate

$$ \left| \int_{t_0}^{t} \| f(x(s),s) - f(y(s), s) \| ds \right| \le L \left| \int_{t_0}^{t} \| (x(s), s) - (y(s), s) \| ds \right| \tag{6}$$

which is enough to make $\mathcal{F}$ a contraction in this setting, shrinking $J$ if necessary so that $\lambda = L\,|J| < 1$. We achieved our objective.

Notice that Lipschitz continuity is not needed in $t$, for the $t$-components cancel in $(6)$: $(x(s), s) - (y(s), s) = (x(s) - y(s), 0)$. Furthermore, because a locally Lipschitzian function is globally Lipschitzian on a compact set, and we will in any case restrict $f$ to a compact set, we just need $f$ to be locally Lipschitzian in $x$.
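As a sanity check on the estimate $(6)$, one can verify numerically that $\mathcal{F}$ contracts the sup-distance. The sketch below is a hypothetical setup of mine: it takes $f(x,t) = \sin x$, which has Lipschitz constant $L = 1$ in $x$, on $J = [0, 1/2]$, so the contraction factor is $\lambda = L\,|J| = 1/2 < 1$:

```python
import math

def apply_F(f, x0, ts, xs):
    # Grid version of F(x)(t) = x0 + integral from ts[0] to t of f(x(s), s) ds
    # (trapezoidal rule)
    out = [x0]
    for i in range(1, len(ts)):
        h = ts[i] - ts[i - 1]
        out.append(out[-1] + 0.5 * h * (f(xs[i - 1], ts[i - 1]) + f(xs[i], ts[i])))
    return out

# f(x, t) = sin(x) is 1-Lipschitz in x, so on J = [0, 1/2] we get lambda = 1/2.
ts = [i * 0.5 / 200 for i in range(201)]      # grid on [0, 1/2]
f = lambda x, t: math.sin(x)
x = [math.cos(3 * t) for t in ts]             # two arbitrary curves in X
y = [t * t for t in ts]
sup = lambda u, v: max(abs(a - b) for a, b in zip(u, v))
lhs = sup(apply_F(f, 0.0, ts, x), apply_F(f, 0.0, ts, y))   # ||F(x) - F(y)||_inf
rhs = 0.5 * sup(x, y)                                       # lambda * ||x - y||_inf
```

On this pair `lhs` comes out well below `rhs`, exactly as the chain $(4)$-$(6)$ predicts.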

What makes this proof interesting (and what made it popular) is that it elegantly yields a method for constructing the (locally, since we reduced $f$'s domain) unique solution through $\mathcal{F}$. In light of Banach's fixed-point theorem, we just choose any starting function in $X$, say $\gamma_0 \in X$, and iterate $\mathcal{F}$ on it: $\mathcal{F}^n(\gamma_0)$. Thus $\gamma = \lim_{n \to \infty} \mathcal{F}^n(\gamma_0)$ exists and is (locally) unique. Therefore $\gamma$ is the (locally) unique fixed point of $\mathcal{F}$ and consequently the (locally) unique solution of $(1)$. We choose $\gamma_0(t) = x_0$ because it is the only function we are sure belongs to $X$.
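Here is a small sketch of that construction (mine, for illustration only) for $\dot{x} = x$, $x(0) = 1$ on $[0, 1]$, starting from the constant function $\gamma_0(t) = x_0 = 1$. Each iterate of $\mathcal{F}$ approximates the next Taylor polynomial of $e^t$, so the sup-distance to the exact solution shrinks toward quadrature accuracy:

```python
import math

ts = [i / 1000 for i in range(1001)]          # grid on [0, 1]

def step(xs):
    # One application of F with f(x, t) = x and x0 = 1 (trapezoidal rule)
    out = [1.0]
    for i in range(1, len(ts)):
        h = ts[i] - ts[i - 1]
        out.append(out[-1] + 0.5 * h * (xs[i - 1] + xs[i]))
    return out

gamma = [1.0] * len(ts)                       # gamma_0(t) = x0 = 1
errors = []
for n in range(12):
    gamma = step(gamma)
    errors.append(max(abs(g - math.exp(t)) for g, t in zip(gamma, ts)))
# errors[0] is e - 2 (first iterate is 1 + t); later entries decay roughly
# like Taylor remainders of e^t until they hit the quadrature floor.
```

The first iterate is exactly $1 + t$, so `errors[0]` equals $e - 2 \approx 0.718$, and after a dozen iterations the sup-error has dropped to the order of the trapezoidal-rule error.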

The sequence $\mathcal{F}^n(\gamma_0)$ is referred to as Picard iteration because it exactly matches Picard's original successive-approximations method. It is important to emphasize that it is the uniqueness proof that produces "Picard's iteration" and not the other way around. However, what Picard (amazingly) managed to do, with the knowledge and mathematical maturity available at the time, was to bypass this polished reasoning (which today is considered simple).

Now take a moment to contemplate how sharp your intuition must be to get through this reasoning without the structure provided by functional analysis.

Last but not least, the failure of local Lipschitz continuity does not necessarily allow non-unique solutions. Uniqueness holds under a weaker condition, known as the Osgood criterion. For more theorems on uniqueness see (R.P. Agarwal, V. Lakshmikantham) Uniqueness and Nonuniqueness Criteria for Ordinary Differential Equations, World Scientific (1993).
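Although the question explicitly does not ask for an example, a quick check of the standard counterexample may still be clarifying here (my own sketch): $f(x, t) = 2\sqrt{|x|}$ is continuous but only Hölder of exponent $1/2$ (not Lipschitz) at $x = 0$, and both $x(t) = 0$ and $x(t) = t^2$ solve $\dot{x} = f(x)$, $x(0) = 0$:

```python
import math

# f(x, t) = 2*sqrt(|x|): continuous everywhere, not Lipschitz at x = 0.
f = lambda x, t: 2.0 * math.sqrt(abs(x))

grid = [i / 100 for i in range(101)]          # t in [0, 1]

def residual(x, dx):
    # sup over the grid of |x'(t) - f(x(t), t)|
    return max(abs(dx(t) - f(x(t), t)) for t in grid)

r_zero = residual(lambda t: 0.0, lambda t: 0.0)          # candidate x(t) = 0
r_parab = residual(lambda t: t * t, lambda t: 2.0 * t)   # candidate x(t) = t^2
# Both residuals vanish: two distinct solutions of the same IVP with x(0) = 0,
# so uniqueness breaks exactly where the Lipschitz condition breaks.
```

This is precisely the branching that Lipschitz continuity rules out: near $0$ the slope field $2\sqrt{|x|}$ grows too fast relative to the distance between solutions for the contraction estimate $(6)$ to hold.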
