[Math] Derive the Boltzmann distribution by an invariance argument

mp.mathematical-physics statistical-physics

In statistical mechanics, the Boltzmann distribution gives the probability of a system being in state $i$ as

$$\displaystyle \frac{e^{- \beta E_i}}{\sum_j e^{-\beta E_j}}$$

where $E_i$ is the energy of state $i$. I have generally seen this demonstrated, starting with some reasonable physical assumptions, via a heat bath argument (as exposited e.g. by Terence Tao) involving interactions between the system and a larger external system. For me, an unsatisfying aspect of the heat bath argument is that it doesn't give me a strong reason to expect that a fundamental function like the exponential should appear at the end.

Here is what I think could be an argument which accomplishes that. By inspection, the Boltzmann distribution depends only on the relative energies of the different states: adding the same constant to every $E_i$ leaves each probability unchanged. Under some mild assumptions this actually characterizes the Boltzmann distribution. Let us suppose there is a non-negative function $f(E)$ such that WLOG $f(0) = 1$ and such that the probability of a system being in state $i$ is

$$\displaystyle \frac{f(E_i)}{\sum_j f(E_j)}.$$
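As a quick numerical illustration of the invariance claim, here is a minimal Python sketch. The function names, the sample energies, the shift of $3.0$, the value $\beta = 1$, and the contrasting weight $1/(1+E^2)$ are all assumptions chosen for illustration; none of them come from the argument itself.

```python
import math

def probabilities(energies, f):
    """Normalize the weights f(E_i) into probabilities f(E_i) / sum_j f(E_j)."""
    weights = [f(e) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

def boltzmann_weight(energy, beta=1.0):
    # Exponential weight; beta = 1.0 is an arbitrary illustrative value.
    return math.exp(-beta * energy)

def lorentzian_weight(energy):
    # Hypothetical non-exponential weight with f(0) = 1, used only for contrast.
    return 1.0 / (1.0 + energy ** 2)

def shift_invariant(energies, f, shift):
    """Check whether adding `shift` to every energy leaves the probabilities unchanged."""
    before = probabilities(energies, f)
    after = probabilities([e + shift for e in energies], f)
    return all(math.isclose(p, q) for p, q in zip(before, after))

energies = [0.0, 1.0, 2.5]  # illustrative energy levels
exp_same = shift_invariant(energies, boltzmann_weight, 3.0)
lor_same = shift_invariant(energies, lorentzian_weight, 3.0)
print("exponential weight invariant:", exp_same)
print("1/(1+E^2) weight invariant:", lor_same)
```

For these inputs the exponential weight passes the check and the contrasting weight does not, which is consistent with (though of course no proof of) the claim that shift invariance singles out the exponential.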

Let us suppose that the system has two states, with energies $E_1$ and $E_2$. Then the requirement that the probabilities depend only on the relative energy $E_1 - E_2$ turns out to be equivalent to the functional equation $f(x + y) = f(x) f(y)$, which under any kind of continuity assumption whatsoever gives $f(x) = e^{ax}$ for some constant $a$ (so that $a = -\beta$ in the notation above).
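To spell out that step, here is a sketch of how the two-state condition leads to the functional equation. It assumes in addition that $f$ is strictly positive, so that ratios make sense, and the auxiliary function $g$ is notation introduced only for this sketch. The probability of the first state is

$$\displaystyle p_1 = \frac{f(E_1)}{f(E_1) + f(E_2)} = \frac{1}{1 + f(E_2)/f(E_1)},$$

so $p_1$ depends only on $E_1 - E_2$ exactly when the ratio of weights does, i.e. when

$$\displaystyle \frac{f(E_1)}{f(E_2)} = g(E_1 - E_2)$$

for some function $g$. Setting $E_2 = 0$ and using $f(0) = 1$ gives $g = f$, hence

$$\displaystyle f(E_1) = f(E_1 - E_2)\, f(E_2).$$

Writing $x = E_1 - E_2$ and $y = E_2$ turns this into $f(x + y) = f(x) f(y)$. Conversely, any $f$ satisfying this equation has $f(E_1)/f(E_2) = f(E_1 - E_2)$, so the probabilities depend only on the energy difference.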

Question 1: How can this argument be fleshed out? In particular, what physical principle would suggest that the Boltzmann distribution only depends on the relative energies of the states? (I seem to recall from my high-school physics lessons that energies are only well-defined up to an additive constant, but I would really appreciate some clarification on this issue.)

Question 2: How does this argument relate to the heat bath argument or the combinatorial argument given, for example, at Wikipedia?

(Motivation: some important functions in mathematics, like the Jones polynomial and various zeta functions, can be interpreted as partition functions of certain statistical-mechanical systems, and I am trying to sharpen my physical intuition about these constructions.)

Best Answer

Like Andreas, I find a maximum entropy argument to be intellectually appealing. However, he says the solution can be found by Lagrange multipliers and I don't know the justification for using Lagrange multipliers. That is, in the space of all probability distributions on the particles, how do you know the maximum entropy solution is really accessible to variational methods?

For a derivation not using Lagrange multipliers, see the bottom of page 9 through page 11 at http://www.math.uconn.edu/~kconrad/blurbs/analysis/entropypost.pdf.
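For orientation, here is a sketch of one standard way to get the maximum entropy statement without Lagrange multipliers, for a finite set of states; it may or may not be the route taken in the linked notes. Let $q_i = e^{-\beta E_i}/Z$ with $Z = \sum_j e^{-\beta E_j}$, and let $p$ be any probability distribution with the same mean energy, $\sum_i p_i E_i = \sum_i q_i E_i$. Gibbs' inequality states that

$$\displaystyle \sum_i p_i \log \frac{p_i}{q_i} \ge 0,$$

with equality only when $p = q$. Rearranging and using $\log q_i = -\beta E_i - \log Z$ gives

$$\displaystyle -\sum_i p_i \log p_i \le -\sum_i p_i \log q_i = \beta \sum_i p_i E_i + \log Z = \beta \sum_i q_i E_i + \log Z = -\sum_i q_i \log q_i.$$

So among all distributions with the given mean energy, $q$ has the largest entropy, and no variational argument is needed to see it.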
