[Math] Understanding the definition of ergodicity through examples

average, probability-theory, statistics, stochastic-processes

I am taking a course on Communication Systems (from an engineering point of view). While I'm usually very interested in the formal mathematics, this time I would like to avoid it, since I don't have a good background on probability theory; also, many of my colleagues have little to no interest in formal mathematics at all, and I would like to understand this concept in a way that would be easy to propagate to them.

So, apparently, to understand the meaning of ergodicity, one needs to know what the ensemble average and the time average of a random process are. After reading this answer on Math.SE, and three related entries on Wikipedia (Ergodic Process, Ergodicity, Stationary ergodic process), this is what I understand:

  • A random process is like a random variable, but its outcomes are "waveforms" (a.k.a. functions) instead of numbers.

  • The ensemble average is the average of the outcomes of the random process, taken at each instant of time, and is therefore itself a function (waveform). A given random process will have one ensemble average (one function).

  • As opposed to the ensemble average, a random process can have many (possibly infinitely many) time averages, since every outcome of the random process (i.e., every waveform) has its own time average, which is the average value of that waveform. That is, given an outcome $x(t)$, its time average will be given by

$$\lim_{T \to \infty} \dfrac{1}{T} \int_{-\frac{T}{2}}^{\frac{T}{2}}x(t)dt$$
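To make the limit above concrete, here is a small numerical sketch (the waveform $x(t) = 2 + \cos t$ is my own example, not from the sources): the oscillating part averages out over a long window, leaving the constant offset 2.

```python
import math

def time_average(x, T, n=100_000):
    """Approximate (1/T) * integral of x(t) dt over [-T/2, T/2] by a Riemann sum."""
    ts = (-T / 2 + T * k / n for k in range(n))
    return sum(x(t) for t in ts) / n

# One hypothetical outcome of a random process: a DC offset plus a cosine.
x = lambda t: 2.0 + math.cos(t)

for T in (10, 100, 1000):
    print(T, time_average(x, T))   # approaches 2.0 as T grows
```

The exact finite-window average is $2 + \frac{2}{T}\sin(T/2)$, so the cosine's contribution dies off like $1/T$, illustrating why the limit picks out the constant part.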

After reading those wikipedia pages, it seems that an ergodic process is a process that satisfies "the time average is equal to the ensemble average". But, which time average? To my understanding there are many time averages. Which one? All of them? Their mean? At least one of them? Or what?


Also, I would like to take a better look on the following examples, found in the linked wikipedia pages:

Example 1.

Suppose that we have two coins: one coin is fair and the other has two heads. We choose (at random) one of the coins, and then perform a sequence of independent tosses of our selected coin. Let X[n] denote the outcome of the nth toss, with 1 for heads and 0 for tails. Then the ensemble average is ½(½ + 1) = ¾; yet the long-term average is ½ for the fair coin and 1 for the two-headed coin. Hence, this random process is not ergodic in mean.
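This example can be simulated directly (a minimal sketch; the function names and parameters are mine). Each run first picks a coin, then tosses it many times; the per-run time average lands near ½ or near 1, never near the ensemble mean ¾:

```python
import random

def time_average_of_tosses(n_tosses, seed):
    """Pick a coin once, toss it n_tosses times, return the average of X[n]."""
    rng = random.Random(seed)
    p_heads = rng.choice([0.5, 1.0])   # fair coin or two-headed coin
    tosses = [1 if rng.random() < p_heads else 0 for _ in range(n_tosses)]
    return sum(tosses) / n_tosses

averages = sorted(time_average_of_tosses(100_000, seed) for seed in range(20))
print(averages)  # values cluster near 0.5 and 1.0, not near 0.75
```

The coin choice is made once per run, so conditioning on it fixes the long-run average; averaging over many runs (the ensemble) mixes the two cases and gives ¾.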

Is this correct? (of course my doubt here is a consequence of the fact that I don't know the answer for my bold question above).

Example 2.

Ergodicity is where the ensemble average equals the time average. Each resistor has thermal noise associated with it and it depends on the temperature. Take N resistors (N should be very large) and plot the voltage across those resistors for a long period. For each resistor you will have a waveform. Calculate the average value of that waveform. This gives you the time average. You should also note that you have N waveforms as we have N resistors. These N plots are known as an ensemble. Now take a particular instant of time in all those plots and find the average value of the voltage. That gives you the ensemble average for each plot. If both ensemble average and time average are the same then it is ergodic.

It says "take a particular instant of time in all those plots". Does any instant work? Or rather, do I have to take "all of them" (one at a time)? Taking only one instant of time doesn't seem right… Rather, shouldn't I be taking some sort of limit to infinity? Also, it refers to "the time average" as if there were only one, but to my understanding there are N different time averages here, since there are N waveforms.

Best Answer

Your question is intuitive, so I will try to answer through one very intuitive example.

Example:


Dynamics: You have some money, say $100, and we're playing a coin game with a fair coin. Each time the coin lands heads (H) we take your money and multiply it by 1.5, and each time the coin lands tails (T) we take your money and multiply it by 0.6.

Averages: Let $W(t)$ denote your wealth at time $t$. Let this process run for a finite time $T$ and take the average of your wealth over that time. This is the finite time average:

$\left\langle W(t)\right\rangle _{T} = \frac{1}{T}\sum_{t=0}^{T}W(t) $

In contrast, assume that we have $N$ of these processes, let each run until time $t$, and then take the average over the $N$ processes. This is the finite ensemble average of the $N$ observed wealth processes:

$\left\langle W(t)\right\rangle _{N} = \frac{1}{N}\sum_{i=1}^{N}W_i(t)$

where $i$ denotes the $i$th of the $N$ processes and the average is taken at time $t$. Letting $T\rightarrow\infty$ gives the time average, and letting $N\rightarrow\infty$ gives the ensemble average (the expectation operator).
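The two finite averages can be transcribed almost literally into code (a sketch; the starting wealth, seed, and sizes are my own choices):

```python
import random

def wealth_path(T, rng):
    """One realization W(0), W(1), ..., W(T) of the coin game, starting at $100."""
    w, path = 100.0, [100.0]
    for _ in range(T):
        w *= 1.5 if rng.random() < 0.5 else 0.6
        path.append(w)
    return path

rng = random.Random(42)

# Finite time average <W>_T: one path, averaged over time.
path = wealth_path(1000, rng)
time_avg = sum(path) / len(path)

# Finite ensemble average <W(t)>_N: N paths, averaged at the fixed time t.
N, t = 2_000, 100
ensemble_avg = sum(wealth_path(t, rng)[-1] for _ in range(N)) / N

print(time_avg, ensemble_avg)
```

Note that the two averages slice the same family of paths along different axes: one path over all times, versus all paths at one time.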


Now that we have introduced the relevant dynamics (the coin game) and the averages, let's restate your definition of ergodicity - namely that the time average equals the ensemble average.

1) You are correct in asserting that each of the N coin-tossing sequences has a time average. However, in the limit, all N time averages are equal, exactly because they are governed by the same dynamic: they all converge to zero with probability one. If you find that unintuitive, try simulating a bunch of those processes and plotting the distribution of wealth at a fixed time. What you will get is approximately log-normal, with diverging moments. E.g., there will be a one-in-a-million path whose wealth grows enormously, while the bulk of the ensemble will be close to zero.
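The suggested simulation can be sketched as follows (parameters and names are mine). The median path ends essentially at zero, while the sample mean is dragged upward by a few rare winners:

```python
import random

def wealth(t, rng):
    """Terminal wealth of one path of the coin game after t steps."""
    w = 100.0
    for _ in range(t):
        w *= 1.5 if rng.random() < 0.5 else 0.6
    return w

rng = random.Random(0)
paths = sorted(wealth(1000, rng) for _ in range(10_000))

print("median:", paths[len(paths) // 2])   # essentially zero
print("mean:  ", sum(paths) / len(paths))  # pulled up by rare big winners
```

This gap between the typical (median) outcome and the mean is exactly the ensemble-versus-time tension the answer describes.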

2) Is the above wealth-generating process ergodic? No. The ensemble average predicts unbounded positive growth$^*$ while the time average converges to zero.$^{**}$ This is commonly known as the St. Petersburg paradox. Unrelated to your question, but both interesting and important: it is possible to create an ergodic observable using the logarithm of wealth, which resolves the St. Petersburg paradox.

Hope you can use this. If you want to see the formal proofs and simulations, I can recommend:

$^{*}$ Ensemble average: $\frac12\times0.6+\frac12\times1.5=1.05$, a number larger than one, reflecting positive growth of the ensemble.

$^{**}$ Time average: $x(t)=r_1^{n_1}r_2^{n_2}$, where $r_1$ and $r_2$ are the two rates and $n_1$ and $n_2$ are the numbers of times the wealth process is subjected to each rate. In the limit $t\rightarrow\infty$, the per-step growth factor is $x(t)^{1/t} \rightarrow (r_1r_2)^{1/2}=\sqrt{0.9}\approx0.95$, a number less than one, i.e., decay in the long-time limit.
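Both footnoted rates can be checked numerically (my own sketch): the arithmetic mean of the multipliers gives the ensemble growth factor, their geometric mean gives the growth factor a single long trajectory actually experiences.

```python
import math
import random

rates = (1.5, 0.6)

# Per-step ensemble growth factor: the expected multiplier (arithmetic mean).
ensemble_rate = 0.5 * rates[0] + 0.5 * rates[1]
print(ensemble_rate)            # 1.05 -> the ensemble grows

# Per-step time-average growth factor: the geometric mean of the multipliers.
time_rate = (rates[0] * rates[1]) ** 0.5
print(time_rate)                # sqrt(0.9) ~ 0.949 -> a single path decays

# Empirical check along one long trajectory: x(t)^(1/t) -> sqrt(0.9).
rng = random.Random(1)
steps = 1_000_000
log_x = sum(math.log(rng.choice(rates)) for _ in range(steps))
print(math.exp(log_x / steps))  # close to 0.949
```

Working in logs turns the product of random rates into a sum, which is why the geometric mean governs the long-run behavior of each path.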

Video

Article
