Understanding the Koopman Operator – Dynamical Systems and Nonlinear Systems

dynamical systems, ergodic-theory, fluid dynamics, koopman-operator, nonlinear system

Consider a continuous time dynamical system

$$\dot x(t) = F(x(t)),$$

where $x(t)$ is the coordinate vector of the state and the right-hand side $F$ is a smooth non-linear function. Let the state space be Euclidean. Let $S^t(x_0)$ be the position of the trajectory of

$$\dot x(t)= F(x(t))$$

that starts at $x_0$, at time $t$. We call $S^t(x_0)$ the flow generated by $\dot x(t) = F(x(t))$.

  1. Why do we care about $S^t(x_0)$? Isn't the equation of the dynamics enough to study the dynamics?

  2. Let's say the state space is $\mathbb{R}$ and

$$\dot x(t)=\frac{1}{x(t)},\quad x(0)=x_0=2.$$

Could anyone explain what $S^t(x_0)$ would be in this case? Will it be the set $\{(x(t), \dot x(t)): t\in [0,\infty)\}$ in $\mathbb R^2$? Thanks for your help.

Let $f$ be any function from $\mathbb{R}\to \mathbb{C}$, say $f(x)=e^{ix}$. We call $f$ an observable of the dynamical system stated above and the value of $f$ is observed on the trajectory starting from $x_0$ and changes as time changes and is defined as

$$f(t,x_0)=f(S^t(x_0)).$$

  3. So from the above expression, it looks like $S^t(x_0)$ is nothing but a point in the domain of the function, in this case some real number, right?

The space of all observables forms a vector space $V$ under some assumptions and we can define a family of operators $U^t$ on this vector space as

$$U^t: V\to V;\quad U^t(f)(x_0)=f(S^t(x_0)).$$

  4. If you could explain the above definition using my example of a dynamical system above, that would help me understand what is really going on here. Why care about $x_0$ in particular here?

For reference, I was trying to understand the Koopman operator for non-linear dynamical systems from Arbabi & Mezić's article "Ergodic theory, Dynamic Mode Decomposition and Computation of Spectral Properties of the Koopman operator".

Best Answer

I would like to provide a somewhat more in depth answer.


I think part of the confusion is due to the choice of notation, so here I'll introduce the setup with new (though compatible) notation. First let $M$ be a smooth manifold (see e.g. the description in What is Riemannian Manifold intuitively?). $M$ models the closed system that will evolve under a certain time evolution; let's call it the configuration space. Any point in $M$ represents a position. $M$ looks locally like a Euclidean space, but globally its shape is typically much more complicated (e.g. consider the surface of an actual donut with the bumps and dents as can be seen with the human eye, or the cornea of a human eye). In particular it might well be that we are a priori unaware of what $M$ globally looks like. (For technical reasons below I'm assuming $M$ is compact ($\ast$) and $C^\infty$.)

There are many ways of introducing a time evolution on $M$; a classical way to do it is to consider an ordinary differential equation on it. To any smooth manifold $M$ one can associate another smooth manifold $TM$, called the tangent bundle of $M$. Any point in $TM$ is of the form $(p,v)$, where $p\in M$ is a position and $v$ is a velocity. Let's call $TM$ the state space. To define an ODE on $M$ is to define a vector field on $M$; which formally is a continuously differentiable function $F:M\to TM$ that is of the form $F(p)=(p,?)$. So $F$ attaches to each point $p$ of $M$ some velocity vector. Then the ODE on $M$ defined by $F$ is of the form

$$x'=F(x),$$

where $x:\mathbb{R}\to M$ is the unknown function. The classical Existence and Uniqueness Theorem in ODE applies in this situation, and says that for any initial condition $p\in M$, there is a unique path $\gamma:\mathbb{R}\to M$ with the property that

$$\gamma(0)=p, \forall t\in\mathbb{R}:\dfrac{\partial \gamma}{\partial t}(t)=F(\gamma(t)).$$

In words, $\gamma$ is at position $p$ at time $0$, and its velocity at time $t$ is precisely the vector $F(\gamma(t))$, so that $F$ is tangent to $\gamma$. Now note that to produce this path $\gamma$ we had to fix the initial condition, but we had no restriction on which initial condition to choose. Thus more precisely we also have that $\gamma$ is a function of $p$:

$$\gamma(t)=\gamma(t,p).$$

Thus we have that $\gamma: \mathbb{R}\times M\to M$. Further, again by the Existence and Uniqueness Theorem we have that $\gamma$ satisfies the group property:

$$\forall p\in M,\forall t_1,t_2\in \mathbb{R}: \gamma(t_1+t_2,p)=\gamma(t_1,\gamma(t_2,p)).$$

In words, starting at an anonymous $p$ and flowing along $F$ for $t_1+t_2$ time is the same as starting at the same $p$, first flowing along $F$ for $t_2$ time to reach some (possibly different) point $q=\gamma(t_2,p)\in M$ and then flowing along $F$ for $t_1$ time starting from $q$. Let's call $\gamma$ the flow of the vector field $F$.

Another way to think of $\gamma: \mathbb{R}\times M\to M$ is as associating to each time $t$ a map of $M$ into itself:

$$\gamma_\bullet:\mathbb{R}\to [M\to M], t\mapsto [p\mapsto \gamma(t,p)].$$

This perspective is useful e.g. when one has a specific time parameter $t^\ast$ and wants to consider where an anonymous point $p$ went in $t^\ast$ time. Now the group property can be reformulated by saying that $\gamma_\bullet:\mathbb{R}\to\operatorname{Diff}^1(M)$ is a group homomorphism ($\dagger$). One can thus think of $\gamma_\bullet$ as a non-linear representation of the time space $\mathbb{R}$.

The final piece is to introduce an observable $\Phi$ on $M$, which for our purposes will be a function $\Phi: M\to \mathbb{C}$ (with typically certain regularity properties). Introducing observables is useful, e.g. because instead of tracking the time evolution of $p$ in $M$ it is easier to track the time evolution of the complex number $\Phi(\gamma(t,p))$ as $t$ varies. It might also be the case that even though $M$ is unknown a certain "1D aspect" of the configuration space is known. Denote by $\mathcal{O}(M)$ the space of all observables on $M$; it's straightforward that $\mathcal{O}(M)$ is a vector space. Further, for any observable $\Phi \in\mathcal{O}(M)$, we can think of $t\mapsto \Phi\circ \gamma(t,\bullet)$ as a time evolution on $\mathcal{O}(M)$, so that $\mathcal{O}(M)$ becomes a configuration space in its own right. Here the observable $\Phi\circ \gamma(t,\bullet)\in\mathcal{O}(M)$ is defined like so: first take an anonymous point $p\in M$, then flow along $F$ for $t$ time, then read off the value $\Phi$ gives to $\gamma(t,p)$. Let us use the abbreviation $U(t,\Phi)=\Phi\circ \gamma(t,\bullet)$ to emphasize that we have a new configuration space and a time evolution (both uniquely determined by $\gamma$, which in turn is uniquely determined by $F$):

$$U:\mathbb{R}\times\mathcal{O}(M)\to\mathcal{O}(M).$$

The group property of $\gamma$ translates to a group property of $U$. Further, for each $t\in \mathbb{R}$, $U(t,\bullet)$ is a linear operator on $\mathcal{O}(M)$. Thus in analogy with ($\dagger$) above, we have a group homomorphism $U_\bullet: \mathbb{R}\to \operatorname{GL}(\mathcal{O}(M))$, which we can think of as a linear representation of the time space $\mathbb{R}$. $U_\bullet$ is called the Koopman flow of $F$.
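The two claims in the last paragraph (linearity of each $U(t,\bullet)$ and the group property of $U$) can be checked numerically. The following is only a minimal sketch: it assumes the closed-form flow of $x'=1/x$ that is derived further down in this answer, and the helper names `flow` and `koopman` are made up for illustration.

```python
import cmath
import math

def flow(t, p):
    """Closed-form (local) flow of x' = 1/x on the nonzero reals.

    Assumption: gamma(t, p) = sign(p) * sqrt(2t + p^2), valid for t > -p^2/2.
    """
    if p == 0 or 2 * t + p * p <= 0:
        raise ValueError("(t, p) is outside the domain of the local flow")
    return math.copysign(math.sqrt(2 * t + p * p), p)

def koopman(t, phi):
    """Return the observable U(t, phi), i.e. phi composed with flow(t, .)."""
    return lambda p: phi(flow(t, p))

f = lambda p: cmath.exp(1j * p)   # the observable e^{ip} from the question
g = lambda p: abs(p - 1)          # distance to the point 1

a, b = 2.0 - 1.0j, 0.5            # arbitrary complex scalars
t1, t2, p = 0.7, 1.3, 2.0         # arbitrary times and initial point

# Linearity: U(t, a f + b g) = a U(t, f) + b U(t, g)
combo = lambda q: a * f(q) + b * g(q)
lhs_lin = koopman(t1, combo)(p)
rhs_lin = a * koopman(t1, f)(p) + b * koopman(t1, g)(p)

# Group property: U(t1, U(t2, f)) = U(t1 + t2, f)
lhs_grp = koopman(t1, koopman(t2, f))(p)
rhs_grp = koopman(t1 + t2, f)(p)

# Both differences should vanish up to floating-point rounding.
print(abs(lhs_lin - rhs_lin), abs(lhs_grp - rhs_grp))
```

Note that linearity holds even though neither the flow nor the observables are linear: $U(t,\bullet)$ only precomposes with $\gamma(t,\bullet)$, and precomposition respects sums and scalar multiples of functions.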


On to the answers to the listed questions.

  1. In the vector field formalism the time evolution is not explicit; it is implied. The flow (your $S^t$, my $\gamma$) makes it explicit.

  2. In my terminology, for $x'=1/x$, $M=\{p\in\mathbb{R}\,|\, p\neq0\}$ is the configuration space, as it parameterizes only positions, and $TM\cong M\times\mathbb{R}$ is the state space. The vector field is $F:x\mapsto (x,1/x)$ (the reason I've restricted the configuration space, which is what you call the state space, is to make $F$ well-defined everywhere). Multiplying both sides by $x$ we get $xx'=1$. The LHS is equal to $(x^2/2)'$, so that by integrating both sides we get $x^2/2=t+C$, where $C$ is a constant. Plugging in $t=0$ we get $(x(0))^2/2=C$. Since a solution cannot cross $0$, it keeps the sign of its initial position, so that for any initial position $p=x(0)\in M$ and any time $t$ with $2t+p^2>0$, the (local) flow $\gamma$ of $F$ is:

$$\gamma(t,p)=\operatorname{sign}(p)\sqrt{2t+p^2}.$$

(Note that in this case $M$ is not compact (nor is $F$ compactly supported), whence the flow of $F$ is not defined for all $t\in\mathbb{R}$; indeed the domain of definition of $\gamma$ is $\{(t,p)\in\mathbb{R}\times M\,|\, t>-p^2/2\}$; see also ($\ast$) above.)

(See also Basic question about finding flow given eigenvalues for another example.)
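As a sanity check on the formula for $\gamma$, one can integrate $x'=1/x$ numerically and compare. This is a hedged sketch rather than part of the argument: the function names and the choice of a classical fixed-step Runge-Kutta scheme are my own assumptions.

```python
import math

def F(x):
    """Right-hand side of the ODE x' = 1/x."""
    return 1.0 / x

def flow(t, p):
    """Closed-form flow gamma(t, p) = sign(p) * sqrt(2t + p^2), for t > -p^2/2."""
    if p == 0 or 2 * t + p * p <= 0:
        raise ValueError("(t, p) is outside the domain of the local flow")
    return math.copysign(math.sqrt(2 * t + p * p), p)

def rk4(p, t_end, steps=1000):
    """Integrate x' = F(x) from x(0) = p up to time t_end with classical RK4."""
    h = t_end / steps
    x = p
    for _ in range(steps):
        k1 = F(x)
        k2 = F(x + 0.5 * h * k1)
        k3 = F(x + 0.5 * h * k2)
        k4 = F(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

p, t = 2.0, 3.0                  # the question's initial condition x_0 = 2
numeric = rk4(p, t)
exact = flow(t, p)               # equals sqrt(2*3 + 2^2) = sqrt(10)
print(numeric, exact)

# Group property of the flow: gamma(t1 + t2, p) = gamma(t1, gamma(t2, p))
t1, t2 = 1.0, 2.0
print(flow(t1 + t2, p), flow(t1, flow(t2, p)))

# A negative initial position stays negative for all admissible times.
print(flow(1.0, -2.0))
```

For fixed $t$ and $p$ the output of `flow` is a single real number, which is exactly what $S^t(x_0)$ is in the question's notation.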


  3. Yes; for fixed $t$ and $x_0$, $S^t(x_0)=\gamma(t,x_0)$ is a single point of $M$ (here a real number).

  4. Here are some examples of observables in $\mathcal{O}(M)$ for the example in item 2:
  • $f_1(p)=e^{ip}$ (which is your example)
  • $f_2(p)=0$ (this is the trivial observable)
  • $f_3(p)= |p-1|$ (this observable reads the distance to $1\in M$)
  • $f_4(p)=\begin{cases} 1, &\text{ if }1\leq p\leq 2\\ 0, &\text{ otherwise }\end{cases}$ (this observable is a sensor; it detects if $p$ is in $[1,2]\subseteq M$).

Looking at the time evolutions of these observables on $M$ (i.e. points of $\mathcal{O}(M)$) under the Koopman flow, we get the following:

  • $U(t,f_1)(p)=f_1(\gamma(t,p))=e^{i\operatorname{sign}(p)\sqrt{2t+p^2}}$
  • $U(t,f_2)(p)=f_2(\gamma(t,p))=0$
  • $U(t,f_3)(p)=f_3(\gamma(t,p))= |\operatorname{sign}(p)\sqrt{2t+p^2}-1|$
  • $U(t,f_4)(p)=f_4(\gamma(t,p))=\begin{cases} 1, &\text{ if }1\leq \operatorname{sign}(p)\sqrt{2t+p^2}\leq 2\\ 0, &\text{ otherwise }\end{cases}$.

Each one of these ought to be interpreted accordingly; e.g. the time evolution of the observable $f_4$ describes which points have trajectories that hit the interval $[1,2]$, and for how long.
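To make the reading of $f_4$ concrete, here is a short worked computation (my own addition, using only the formula for $\gamma$ above). For $p>0$ the condition $1\leq\sqrt{2t+p^2}\leq 2$ can be solved for $t$:

$$U(t,f_4)(p)=1 \iff \frac{1-p^2}{2}\leq t\leq \frac{4-p^2}{2}.$$

This window always lies inside the domain $t>-p^2/2$ of the local flow, and its length is $3/2$ regardless of $p$. So every trajectory starting at a positive point spends exactly $3/2$ units of time in $[1,2]$: entirely at positive times if $p<1$, entirely at negative times if $p>2$, and around $t=0$ if $1\leq p\leq 2$. For $p<0$ we have $\gamma(t,p)<0$ throughout, so $U(t,f_4)(p)=0$ for all admissible $t$.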

The notation $U_t(f)=U(t,f)$ makes sense without any reference to a point of the original configuration space $M$; but $U_t(f)$ is a function on $M$, and the only way to describe a function is to say what it does to an anonymous point $p$:

$$U_\bullet: \mathbb{R}\to\operatorname{GL}(\mathcal{O}(M)), t\mapsto [f\mapsto [p\mapsto f(\gamma(t,p))]].$$


There is a final question that the OP didn't ask but is relevant: what is the benefit of switching to the Koopman flow? Typically $M$ is finite-dimensional but non-linear, with a non-linear time evolution $\gamma$. Koopmanizing, we get the space $\mathcal{O}(M)$ (which has different versions; see e.g. Constructing unitary representations using quasi-invariant measures, Ergodicity of surjective continuous endomorphism of compact abelian group (confused about a step)), which is (often) infinite-dimensional but linear, with a linear time evolution $U$. Thus we trade finite-dimensionality for linearity. If $\mathcal{O}(M)$ can be efficiently truncated to a finite-dimensional subspace in a way adapted to $U$, then one obtains a linearization of the original time evolution at no cost in finite-dimensionality.
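The example $x'=1/x$ happens to admit such a truncation, which gives a small illustration of the last sentence. Since $\gamma(t,p)^2=2t+p^2$, the observable $p\mapsto p^2$ evolves to $p\mapsto p^2+2t$, so the two-dimensional subspace spanned by $1$ and $p^2$ is invariant under $U$. The sketch below is my own illustration and not taken from the cited article; the names `matrix` and `apply` are hypothetical. It compares the action of $U(t,\bullet)$ on that subspace with a $2\times 2$ matrix.

```python
import math

def flow(t, p):
    """Closed-form flow of x' = 1/x: sign(p) * sqrt(2t + p^2), for t > -p^2/2."""
    if p == 0 or 2 * t + p * p <= 0:
        raise ValueError("(t, p) is outside the domain of the local flow")
    return math.copysign(math.sqrt(2 * t + p * p), p)

def koopman(t, phi):
    """Return the observable U(t, phi), i.e. phi composed with flow(t, .)."""
    return lambda p: phi(flow(t, p))

def matrix(t):
    """Matrix of U(t, .) restricted to span{1, p^2}, in the basis (1, p^2).

    U(t)(a + b p^2) = a + b (2t + p^2) = (a + 2 t b) + b p^2.
    """
    return [[1.0, 2.0 * t],
            [0.0, 1.0]]

def apply(mat, coeffs):
    """Multiply a matrix (list of rows) by a coefficient vector."""
    return [sum(m * c for m, c in zip(row, coeffs)) for row in mat]

a, b = 3.0, -0.5                      # observable phi(p) = a + b p^2
phi = lambda p: a + b * p * p
t, p = 1.25, 2.0

# Evolve the coefficients linearly, then evaluate at the initial point p.
a_t, b_t = apply(matrix(t), [a, b])
via_matrix = a_t + b_t * p * p

# Evolve the point nonlinearly, then evaluate the original observable.
via_flow = koopman(t, phi)(p)

print(via_matrix, via_flow)           # the two values agree up to rounding
```

On this subspace the non-linear flow is replaced by the linear map $t\mapsto\begin{pmatrix}1 & 2t\\ 0 & 1\end{pmatrix}$, which satisfies the group property by ordinary matrix multiplication. The observable $e^{ip}$ from the question does not lie in this subspace, so the truncation captures only part of the dynamics.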

See also the discussions at Mathematical framework of the Koopman operator, Importance of Group Representation theory, What is Representation Theory?, Importance of Representation Theory (where by "representation" a linear (to say the least) representation is meant).
