Statistical Mechanics – Understanding the Variational Principle for Canonical Ensemble

entropy, statistical-mechanics, variational-principle

In all textbooks I know, the derivation of the canonical probability distribution starts from the microcanonical ensemble. In my opinion, this is more of a motivation than a proper derivation, since additional hypotheses must be introduced, which might not hold for every conceivable physical system.

Alternatively, the second law of thermodynamics states that the entropy variation of a closed system satisfies $\Delta S \ge 0$. If $\Delta S \neq 0$, the entropy is still increasing and the system is not in equilibrium; this only stops once the entropy attains its maximum, at which point equilibrium is reached. For a system held at fixed temperature $T$, maximizing the total entropy of the system together with its heat bath is equivalent to requiring that the system's state minimizes the free energy $F = U - TS$.
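More precisely, assuming the system exchanges only heat with a reservoir at temperature $T$ (fixed volume and particle number), the reservoir's entropy changes by $\Delta S_{\text{res}} = -\Delta U/T$, and the second law applied to the combined closed system gives
$$0 \le \Delta S_{\text{tot}} = \Delta S + \Delta S_{\text{res}} = \Delta S - \frac{\Delta U}{T} = -\frac{\Delta F}{T},$$
so $\Delta F \le 0$ along any spontaneous process and equilibrium corresponds to the minimum of $F$.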

Consider a system with fixed volume $\Lambda \subset \mathbb{R}^{d}$, temperature $T$ and number of particles $N$. The phase space is $\Gamma_{\Lambda}:= (\Lambda\times \mathbb{R}^{d})^{N}$ (positions and momenta), and the system is described by some Hamiltonian $H_{\Lambda,\beta,N}$, with $\beta = 1/T$ as usual. Since $\Gamma_{\Lambda}$ is a metric space, let it be equipped with its Borel $\sigma$-algebra $\mathbb{B}_{\Lambda}$ and let $\mathcal{M}$ be the set of all probability measures $\mu$ on $(\Gamma_{\Lambda}, \mathbb{B}_{\Lambda})$ which are absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^{2dN}$. In other words, every $\mu \in \mathcal{M}$ is of the form $d\mu = \rho\, dx$ for some nonnegative measurable function $\rho$.

Given a measure $\mu \in \mathcal{M}$, the "number of accessible states" of the system is $\mu(\Gamma_{\Lambda})$, so it seems natural to define the entropy $S_{\Lambda, \beta, N}$ of this system by:
$$S_{\Lambda,\beta,N}(\mu) := -k_{B}\ln \mu(\Gamma_{\Lambda})$$

My question is: Is this scenario correct and consistent with the canonical ensemble? And if so, is the canonical ensemble distribution $d\mu_{\Lambda,\beta,N} = \frac{1}{Z_{\Lambda,\beta,N}}e^{-\beta H_{\Lambda,\beta,N}}dx$ the solution of the following variational principle:
$$\inf_{\mu \in \mathcal{M}}\bigl(\mathbb{E}_{\mu}[H_{\Lambda,\beta,N}] - TS_{\Lambda,\beta,N}(\mu)\bigr),$$
where $\mathbb{E}_{\mu}[\cdot]$ denotes the expectation with respect to the measure $\mu$?

To put it another way: can we obtain the canonical distribution from a variational principle that minimizes the free energy?

Best Answer

Let me discuss the simplest possible setting, in which the set of possible states $\Omega$ is finite. Think, for instance, of a finite-volume Ising model (or any other finite system, each of whose variables takes only finitely many values).

Let $\mu\in\mathcal{M}(\Omega)$ be a probability measure on $\Omega$ and denote by $H:\Omega\to\mathbb{R}$ the energy. In this case, the entropy is naturally defined as $$ S(\mu) = -k_B\sum_{\omega\in\Omega} \mu(\omega)\ln\mu(\omega). $$ (Note that you don't want to define it as $S(\mu)=-k_B\ln\mu(\Omega)$, which in any case would be equal to $0$ since $\mu$ is a probability measure. The fact that $\Omega$ represents all possible microstates does not make such a definition reasonable, since these microstates correspond in general to different values of the energy. The same is true of the definition you propose to use: the correct definition of the entropy would involve the relative entropy of $\mu$ with respect to the Lebesgue measure, $S(\mu) = -k_B\int \mu(dx) \ln(d\mu/dx)$, where $d\mu/dx$ denotes the Radon–Nikodym derivative.)
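To connect this with the familiar Boltzmann formula: if $\mu$ is the uniform measure on $\Omega$, then $$ S(\mu) = -k_B\sum_{\omega\in\Omega}\frac{1}{|\Omega|}\ln\frac{1}{|\Omega|} = k_B\ln|\Omega|, $$ which is the microcanonical entropy, whereas $-k_B\ln\mu(\Omega)$ vanishes for every probability measure $\mu$.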

You then have
$$ \mathbb{E}_\mu[H] - T S(\mu) = \sum_{\omega\in\Omega} \Bigl( H(\omega) + \frac{1}{\beta} \ln\mu(\omega) \Bigr) \mu(\omega) = -\frac{1}{\beta}\sum_{\omega\in\Omega} \ln\Bigl[\frac{e^{-\beta H(\omega)}}{\mu(\omega)}\Bigr] \mu(\omega) . $$
By Jensen's inequality (applied to the concave function $\ln$),
$$ \sum_{\omega\in\Omega} \ln\Bigl[\frac{e^{-\beta H(\omega)}}{\mu(\omega)} \Bigr]\mu(\omega) \leq \ln \biggl[ \sum_{\omega\in\Omega} \frac{e^{-\beta H(\omega)}}{\mu(\omega)} \mu(\omega) \biggr] = \ln \sum_{\omega\in\Omega} e^{-\beta H(\omega)} = \ln Z_\beta, $$
where $Z_\beta = \sum_{\omega\in\Omega} e^{-\beta H(\omega)}$ is the partition function. Moreover, the inequality becomes an equality if and only if the function $\omega\mapsto e^{-\beta H(\omega)}/\mu(\omega)$ is constant, that is, if and only if
$$ \mu(\omega) = \frac{1}{Z_\beta} e^{-\beta H(\omega)}. $$
We conclude that
$$ \inf_{\mu\in\mathcal{M}(\Omega)} \bigl(\mathbb{E}_\mu[H] - T S(\mu) \bigr) $$
is equal to the free energy $F(\beta)=-k_B T\ln Z_\beta$ and is attained exactly when $\mu$ is the Gibbs measure.
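As a quick numerical sanity check of this variational characterization, the following short Python sketch evaluates the functional $\mathbb{E}_\mu[H] - TS(\mu)$ on a toy five-state system; the energies and the value of $\beta$ below are arbitrary choices, and units with $k_B = 1$ are used. The Gibbs measure reproduces $-T\ln Z_\beta$, while randomly drawn probability vectors never fall below that value.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy finite state space: five microstates with made-up energies
    H = np.array([0.0, 1.0, 1.0, 2.5, 3.0])
    beta = 2.0          # inverse temperature (k_B = 1)
    T = 1.0 / beta

    def free_energy_functional(mu):
        """E_mu[H] - T*S(mu), with S(mu) = -sum_omega mu ln mu and k_B = 1."""
        S = -np.sum(mu * np.log(mu))
        return np.dot(mu, H) - T * S

    # Gibbs measure and the exact free energy -T ln Z_beta
    Z = np.sum(np.exp(-beta * H))
    gibbs = np.exp(-beta * H) / Z
    print("-T ln Z_beta            :", -T * np.log(Z))
    print("functional at Gibbs     :", free_energy_functional(gibbs))

    # Random probability vectors always give a strictly larger value
    for _ in range(5):
        mu = rng.dirichlet(np.ones(len(H)))
        print("functional at random mu :", free_energy_functional(mu))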

Some remarks:

  • All the above extends to the case of particles in $\Lambda\subset\mathbb{R}^d$. See, for instance, §5.5 in Gallavotti's book.
  • Things become much more interesting in the thermodynamic limit (that is, for an infinite system of particles, or of spins, say). Indeed, in that case there is, in general, no longer a unique Gibbs measure. Nevertheless, a version of the above still holds, at least for translation-invariant Gibbs measures. This is discussed in many places, for instance in Section 6.9 of our book. You can actually go even further and show that translation-invariant Gibbs measures can be identified with the tangent functionals to the pressure, seen as a functional on the space of all interactions; see Chapter 16 in Georgii's book, for instance.