[Math] An intuitive explanation of how the mathematical definition of ergodicity implies the layman’s interpretation ‘all microstates are equally likely’.

Tags: definition · ergodic-theory · intuition · statistical-mechanics

I'm self-studying Statistical Mechanics; in it I came across the Fundamental
Postulate of Statistical Mechanics, which led me to the ergodic hypothesis.

In the plainest language, it says:

In an isolated system in thermal equilibrium, all accessible microstates are equally likely.

Nevertheless, I have been able to carry on with Statistical Mechanics so far.

Lately, I came across the actual definition of ergodicity, specifically Wikipedia's:

[…] the term ergodic is used to describe a dynamical system which, broadly speaking, has the same behavior averaged over time as averaged over the space of all the system's states (phase space).

In statistics, the term describes a random process for which the time average of one sequence of events is the same as the ensemble average.

Wikipedia writes about ergodic hypothesis:

[…] over long periods of time, the time spent by a system in some region of the phase space of microstates with the same energy is proportional to the volume of this region …

Also, as Arnold Neumaier wrote about ergodic hypothesis:

[…] every phase space trajectory comes arbitrarily close to every phase space point with the same values of all conserved variables as the initial point of the trajectory ….

I couldn't fully grasp those mathematical definitions, as they are beyond my level; still, I tried to connect them with the layman's one, but couldn't. I know a bit about phase space and ensembles, and nothing more.

I would appreciate it if someone could explain intuitively how the definitions "the time average of one sequence of events is the same as the ensemble average" and "the time spent by a system in some region of the phase space of microstates with the same energy is proportional to the volume of this region" imply the layman's interpretation of the ergodic hypothesis.

Best Answer

It's useful to consider finite-state Markov chains with states $\{ 1, \ldots, N \}$. Such a Markov chain is defined by its transition matrix $P = (P_{ij})_{i,j=1}^N$. We require that $0 \leq P_{ij} \leq 1$ for each $i, j = 1, \ldots, N$ and that $\sum_{j=1}^N P_{ij} = 1$. Thus, we can think of $P_{ij}$ as the probability of jumping from state $i$ to state $j$. We initialize the Markov chain in a state $X_0$ and let $X_n$ be the state at time $n$ (so $X_n$ is a random variable in $\{ 1, \ldots, N \}$).
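To make this concrete, here is a minimal sketch of such a chain in Python. The 3-state transition matrix `P` below is a hypothetical example, not anything from the question; each row gives the jump probabilities out of one state and sums to 1:

```python
import random

# A hypothetical 3-state transition matrix: entry P[i][j] is the
# probability of jumping from state i to state j (each row sums to 1).
P = [
    [0.5, 0.3, 0.2],
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
]

def step(state, P):
    """Jump from `state` to a new state drawn according to row P[state]."""
    return random.choices(range(len(P)), weights=P[state])[0]

def simulate(x0, n, P, seed=0):
    """Return the trajectory X_0, X_1, ..., X_n starting from X_0 = x0."""
    random.seed(seed)
    traj = [x0]
    for _ in range(n):
        traj.append(step(traj[-1], P))
    return traj

traj = simulate(0, 10, P)
print(traj)  # a random walk through the states {0, 1, 2}
```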

A natural requirement is that the Markov chain be irreducible, which essentially means that we can get from any state to any other state with positive probability.

A finite-state Markov chain is said to be ergodic if it is irreducible and has an additional property called aperiodicity. The ergodic theorem for Markov chains says (roughly) that an ergodic Markov chain approaches its "stationary distribution" (a probability vector $\pi$ satisfying $\pi P = \pi$) as time $n \to \infty$.
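One can watch this convergence numerically: for an ergodic chain, every row of the $n$-step transition matrix $P^n$ approaches the stationary distribution, regardless of the starting state. The sketch below uses a hypothetical 3-state birth-death chain whose stationary distribution works out to $(1/4, 1/2, 1/4)$ by detailed balance:

```python
# A hypothetical ergodic (irreducible, aperiodic) transition matrix.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Compute P^51 by repeated multiplication: each row of P^n is the
# distribution of X_n given a particular starting state.
Pn = P
for _ in range(50):
    Pn = mat_mul(Pn, P)

for row in Pn:
    print([round(x, 6) for x in row])
# All rows agree: the chain forgets its starting state and settles
# into the stationary distribution (0.25, 0.5, 0.25).
```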

Now, in the case of physical systems, one usually also assumes that the dynamics are reversible, which for a Markov chain we can take to mean that the transition matrix is symmetric: $P_{ij} = P_{ji}$, i.e. jumping from $i$ to $j$ is exactly as likely as jumping from $j$ to $i$. It turns out that the stationary distribution of a finite-state irreducible Markov chain with symmetric transition matrix is the uniform distribution, which assigns equal probability $1/N$ to each of the possible states: symmetry makes each column of $P$ sum to $1$ as well, so $\sum_i \frac{1}{N} P_{ij} = \frac{1}{N}$.

Putting all this together, we see that a finite-state reversible ergodic Markov chain converges to the uniform distribution (i.e. reaches an equilibrium as time goes to infinity in which all states are equally likely).
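This is exactly where the layman's statement comes from, and it can be checked by simulation: for a symmetric (reversible) ergodic chain, the fraction of time a single long trajectory spends in each state, the time average, approaches $1/N$. The matrix below is again a hypothetical example:

```python
import random

# A hypothetical symmetric transition matrix (P[i][j] == P[j][i]),
# modelling microscopic reversibility: i -> j is as likely as j -> i.
P = [
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
]

random.seed(42)
N, n_steps = len(P), 200_000
counts = [0] * N
state = 0
for _ in range(n_steps):
    state = random.choices(range(N), weights=P[state])[0]
    counts[state] += 1

# Time averages: the fraction of steps the trajectory spent in each state.
freqs = [c / n_steps for c in counts]
print(freqs)  # each entry is close to 1/3: all states equally likely
```

Note that the time average over one trajectory matches the ensemble average (the uniform distribution), which is precisely the "time average = ensemble average" phrasing from the question.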

The notion of ergodic dynamical system you asked about is a vast generalization of this idea.
