Practical meaning of Takens’ Theorem using your example
The butterfly-like structure traced out by the trajectories of the Lorenz system is the attractor of this dynamics. Its properties contain useful information about the dynamics, e.g., that it is chaotic and how the two “wings” interact. In a typical situation you do not have access to all dynamical variables ($x$, $y$, and $z$), but only to one time series, say $z$.
Takens’ theorem now states that you can obtain a structure that is topologically equivalent to your attractor by means of a delay embedding. It further gives an upper bound for the required dimension of this embedding. However, this bound is not so useful in practice, as you do not know the quantities going into it. Also, the estimate is usually too high: for example, the Lorenz attractor can be embedded with a three-dimensional delay embedding, while Takens’ theorem only guarantees that a seven-dimensional embedding suffices.
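To make the construction concrete, here is a minimal sketch of such a delay embedding (the helper name `delay_embed` and its parameters are my own, not from any particular library):

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Delay-embed a scalar time series x.

    Row i of the result is (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]),
    i.e., one point of the reconstructed phase space.
    """
    n = len(x) - (dim - 1) * tau  # number of embedded points
    return np.column_stack([x[k * tau : k * tau + n] for k in range(dim)])
```

Applied with `dim=3` to, say, the $z$ component of a Lorenz trajectory, the resulting three-dimensional points trace out a structure topologically equivalent to the butterfly-shaped attractor.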
Clarification
I presume that at least some of your confusion stems from the following sentence from your second quote:
Takens has shown that embeddings with $d > 2n$ will be faithful generically
Were this written in analogy to your first quote, the relation would have to be $d>2D$. (Note that the quoted statement is not incorrect though, since $n>D$, so $d>2n$ implies $d>2D$.)
The equivalences between your first and second quote are as follows:
| first quote | second quote |
| --- | --- |
| $M$ | attractor |
| $m$ | $D$ |
| – | $n$ |
| – | $d_\text{e}$ |
Your questions
An $n^{th}$-order deterministic dynamical system means that it has $n$ degrees of freedom? I don't understand what $n$ (or $m$ in the theorem) actually is.
You are correct regarding $n$. However, $n$ is not equal to the $m$ from the theorem. The closest equivalent to $n$ in your first quote is the dimension of some $ℝ^n$ into which $M$ is embedded.
The time-series lives on some $D$-dimensional attractor, so that would be equivalent to saying we are measuring some system and we record data of dimension $D$?
No. The dimension of the attractor is a property of the dynamics. It is independent of your number of actually measured observables.
For example, a limit-cycle dynamics has a one-dimensional attractor, as you can identify positions on the attractor with one real number¹, namely the phase. A quasiperiodic dynamics that is a superposition of two periodic dynamics with incommensurable frequencies has a dimension of two, as you need two phases to identify a position on the attractor. In general, the attractor is some subset of a $D$-manifold ($M$ in the first quote), which in turn is embedded in the $n$-dimensional state space of the dynamics (hence $D<n$). For example, for your Lorenz system, the butterfly-shaped structure traced out by the trajectories is the attractor.
I.e. imagine we are measuring some system of stock prices consisting of three different stocks, and we sample this price at every $\Delta t$, then $D=3$?
No, at best we have $n=3$, and that is only if those three stock prices interact with nothing else. If you have other external factors to consider, this adds degrees of freedom and thus increases $n$.
So assuming e.g. $n=4$, then as long as my $d_\text{e}=9$ or more I can accurately map from that space back to the measured space […]?
I think you mean the right thing, but I wouldn’t use the term measured space for the phase space or attractor, as the entire point of the Takens embedding is that you reconstruct a phase space or attractor that you cannot measure due to practical constraints.
Also note that in this statement you can replace $n$ by $D$ (see above) or even the box-counting dimension $D_B$ of the attractor (Theorem of Sauer, Yorke, and Casdagli).
¹ assuming that the number is mapped to the position in a reasonable (i.e., piecewise smooth) fashion
Best Answer
First of all, beware that all you can possibly obtain is an estimate of the tangent vector of the trajectory in the reconstructed phase space.
I see two general cases:
Your sampling is sufficiently fine that you can estimate the tangent vector by subtracting two subsequent points of your reconstructed phase space (and possibly normalising by the sampling interval). For example, if your reconstructed phase-space vectors are $$\mathbf{y}(t) = \left( x(t),x(t-τ),x(t-2τ),… \right),$$ and $Δt$ is your sampling interval, you could estimate the tangent vector at $t+\tfrac{1}{2}Δt$ as: $$\mathbf{T}\left( t+\tfrac{1}{2}Δt \right) ≈ \frac{\mathbf{y}(t+Δt)-\mathbf{y}(t)}{Δt}$$
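In code, this forward-difference estimate is a one-liner (a minimal sketch; the array `Y` holding the reconstructed vectors $\mathbf{y}(t)$ row by row is my own convention):

```python
import numpy as np

def tangent_estimates(Y, dt):
    """Forward-difference tangent vectors of a reconstructed trajectory.

    Y  : array of shape (N, d); row i is the embedded vector y(t_i)
    dt : sampling interval
    Returns an (N-1, d) array of estimates, valid at the midpoints t_i + dt/2.
    """
    return np.diff(Y, axis=0) / dt
```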
If your noise is very high, it may be reasonable to average over many points or apply some non-linear noise-reduction techniques first. These are mainly based on averaging over nearby points in phase space, assuming that the corresponding trajectory segments are parallel to the one you are interested in.
If your sampling is so coarse that you cannot use more than one reconstructed point per oscillation, you have to somehow associate these points with each other to reconstruct a phase-space trajectory. Some techniques for non-linear noise reduction rely on this anyway, so you may want to check them out. My ad-hoc suggestion would be to sort nearby points by an instantaneous phase obtained from a Poincaré section and assume that this represents the trajectory in the vicinity of your point of interest.
As for which method is best (even within the above two cases), I do not think there can be a general answer, as it depends on too many factors:
- What your data is like (sampling rate, noise, …) affects the viability of methods and whether noise reduction is reasonable.
- Where do you need your tangent vectors? A difference of subsequent phase-space points will give you a good estimate for the tangent vector in the middle of them (assuming a sufficiently fine sampling rate), but it won’t be such a good estimate for either phase-space point – here, a central difference may be better.
- Related to the above, you’ll always have to make a trade-off between the accuracy of the tangent vector’s direction and its position. Again, a central difference yields high accuracy of position at the price of lower accuracy in direction, compared to, e.g., a forward difference.
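The trade-off can be illustrated with a toy comparison on a sampled sine wave (standing in for one embedding coordinate, not real embedded data): evaluated at the sample points themselves, a central difference is far more accurate than a forward difference, since its error is second order in the sampling interval rather than first order.

```python
import numpy as np

dt = 0.1
t = np.arange(0.0, 6.0, dt)
x = np.sin(t)  # stand-in for one coordinate of the reconstructed trajectory

forward = (x[1:] - x[:-1]) / dt        # forward difference, naturally valid at midpoints
central = (x[2:] - x[:-2]) / (2 * dt)  # central difference, valid at the sample points t[1:-1]

# compare both against the true derivative cos(t) at the interior sample points
err_forward = np.max(np.abs(forward[1:] - np.cos(t[1:-1])))
err_central = np.max(np.abs(central - np.cos(t[1:-1])))
```

Here `err_central` is markedly smaller than `err_forward`, reflecting that the forward difference is biased when read at a sample point instead of a midpoint.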