[Physics] Motivation for the use of Tsallis entropy

entropy, mathematical physics, statistical mechanics

Every now and again I hear something about Tsallis entropy,
$$
S_q(\{p_i\}) = \frac{1}{q-1}\left( 1- \sum_i p_i^q \right), \tag{1}
$$
and I decided to finally get around to investigating it. I haven't got very deep into the literature (I've just lightly skimmed Wikipedia and a few introductory texts), but I'm completely confused about the motivation for its use in statistical physics.

As an entropy-like measure applied to probability distributions, the Tsallis entropy has the property that, for two independent random variables $A$ and $B$,
$$
S_q(A, B) = S_q(A) + S_q(B) + (1-q)S_q(A)S_q(B).\tag{2}
$$
In the limit as $q$ tends to $1$ the Tsallis entropy becomes the usual Gibbs-Shannon entropy $H$, and we recover the relation
$$H(A,B) = H(A) + H(B)\tag{3}$$
for independent $A$ and $B$.
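As a quick numerical sanity check of Equations $1$ to $3$, here is a minimal sketch in Python with NumPy. The two marginal distributions and the value of $q$ are arbitrary illustrative choices, and the function name `tsallis_entropy` is just a label introduced here.

```python
import numpy as np

def tsallis_entropy(p, q):
    """Tsallis entropy S_q of Equation (1); Gibbs-Shannon entropy (nats) at q = 1."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]  # zero-probability states contribute nothing
    if q == 1.0:
        return float(-np.sum(p * np.log(p)))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

# Arbitrary marginals, chosen only for illustration.
p_a = np.array([0.5, 0.3, 0.2])
p_b = np.array([0.6, 0.4])
p_ab = np.outer(p_a, p_b)  # joint distribution of independent A and B

# Pseudo-additivity, Equation (2), at an arbitrary q != 1.
q = 1.5
s_a, s_b = tsallis_entropy(p_a, q), tsallis_entropy(p_b, q)
lhs = tsallis_entropy(p_ab, q)
rhs = s_a + s_b + (1.0 - q) * s_a * s_b
print(f"q = {q}: S_q(A,B) = {lhs:.6f}, right-hand side of (2) = {rhs:.6f}")

# Approach to the Gibbs-Shannon value as q -> 1.
for q_near in (1.1, 1.01, 1.001):
    print(f"q = {q_near}: S_q(A) = {tsallis_entropy(p_a, q_near):.6f}")
print(f"q = 1  : H(A)   = {tsallis_entropy(p_a, 1.0):.6f}")
```

The two sides of Equation $2$ should agree to floating-point precision, and the values of $S_q(A)$ should drift towards $H(A)$ as $q$ approaches $1$.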

As a mathematical property this is perfectly fine, but the motivation for its use in physics seems completely weird, unless I've fundamentally misunderstood it. From what I've read, the argument seems to be that for strongly interacting systems such as gravitationally-bound ones, we can no longer assume the entropy is extensive (fair enough so far), and that we therefore need an entropy measure that behaves non-extensively for independent sub-systems, as in Equation $2$ above, for an appropriate value of $q$.

The reason this seems weird is the assumption of independence of the two sub-systems. Surely the very reason we can't assume the entropy is extensive is that the sub-systems are strongly coupled, and therefore not independent.

The usual Boltzmann-Gibbs statistical mechanics seems well equipped to deal with such a situation. Consider a system composed of two sub-systems, $A$ and $B$. If sub-system $A$ is in state $i$ and $B$ is in state $j$, let the energy of the system be given by $E_{ij} = E^{(A)}_i + E^{(B)}_j + E^{(\text{interaction})}_{ij}$. For a canonical ensemble we then have
$$
p_{ij} = \frac{1}{Z} e^{-\beta E_{ij}} = \frac{1}{Z} e^{-\beta \left(E^{(A)}_i + E^{(B)}_j + E^{(\text{interaction})}_{ij}\right)}.
$$
If the values of $E^{(\text{interaction})}_{ij}$ are small compared to those of $E^{(A)}_i$ and $E^{(B)}_j$ then this approximately factorises into $p_{ij} = p_ip_j$, with $p_i$ and $p_j$ also being given by Boltzmann distributions, calculated for $A$ and $B$ independently. However, if $E^{(\text{interaction})}_{ij}$ is large then we can't factorise $p_{ij}$ in this way and we can no longer consider the joint distribution to be the product of two independent distributions.
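To make the factorisation point concrete, here is a small Python sketch of a toy two-part system. The energy levels, the coupling matrix, and the helper names `joint_boltzmann` and `factorisation_error` are all hypothetical choices made for illustration; they do not come from any particular physical model.

```python
import numpy as np

def joint_boltzmann(e_a, e_b, e_int, beta):
    """Canonical p_ij for E_ij = E^A_i + E^B_j + E^int_ij."""
    energy = e_a[:, None] + e_b[None, :] + e_int
    weights = np.exp(-beta * (energy - energy.min()))  # shift for numerical stability
    return weights / weights.sum()

def factorisation_error(p_ab):
    """Largest absolute difference between p_ij and the product of its marginals."""
    p_a = p_ab.sum(axis=1)
    p_b = p_ab.sum(axis=0)
    return float(np.abs(p_ab - np.outer(p_a, p_b)).max())

# Hypothetical energy levels and coupling pattern (illustrative only).
e_a = np.array([0.0, 1.0, 2.0])
e_b = np.array([0.0, 1.5])
coupling = np.array([[0.0, 1.0],
                     [1.0, 0.0],
                     [0.5, -0.5]])
beta = 1.0

# Scale the interaction term from absent to dominant.
for strength in (0.0, 0.01, 1.0, 5.0):
    p_ab = joint_boltzmann(e_a, e_b, strength * coupling, beta)
    err = factorisation_error(p_ab)
    print(f"coupling strength {strength:4.2f}: factorisation error = {err:.4f}")
```

With the interaction switched off the error vanishes up to rounding, and it grows as the coupling strength becomes comparable to the level spacings. The coupling matrix is deliberately not of the form $f_i + g_j$, since an interaction of that form could be absorbed into the single-system energies and would still factorise.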

Anyone familiar with information theory will know that equation $3$ does not hold for non-independent random variables. The more general relation is
$$
H(A,B) = H(A) + H(B) - I(A;B),
$$
where $I(A;B)$ is the mutual information, a symmetric measure of the correlation between two variables, which is always non-negative and becomes zero only when $A$ and $B$ are independent. The thermodynamic entropy of a physical system is just the Gibbs-Shannon entropy of a Gibbs ensemble, so if $A$ and $B$ are interpreted as strongly interacting sub-systems then the usual Boltzmann-Gibbs statistical mechanics already tells us that the entropy is not extensive, and the mutual information gets a physical interpretation as the degree of non-extensivity of the thermodynamic entropy.
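The relation between joint entropy and mutual information can be checked numerically for a Gibbs ensemble. The sketch below is only an illustration: the energy levels and coupling matrix are hypothetical, entropies are in nats with $k_B = 1$, and $I(A;B)$ is computed independently as the Kullback-Leibler divergence between $p_{ij}$ and the product of its marginals, so that the identity is actually tested rather than assumed.

```python
import numpy as np

def shannon_entropy(p):
    """Gibbs-Shannon entropy in nats (k_B = 1)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def mutual_information(p_ab):
    """I(A;B) as the KL divergence between p_ij and the product of its marginals."""
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    product = p_a * p_b
    mask = p_ab > 0
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / product[mask])))

def joint_boltzmann(e_a, e_b, e_int, beta=1.0):
    """Canonical p_ij for E_ij = E^A_i + E^B_j + E^int_ij."""
    energy = e_a[:, None] + e_b[None, :] + e_int
    weights = np.exp(-beta * (energy - energy.min()))
    return weights / weights.sum()

def entropy_balance(p_ab):
    """Return (H(A,B), H(A), H(B), I(A;B)) for a joint distribution."""
    h_a = shannon_entropy(p_ab.sum(axis=1))
    h_b = shannon_entropy(p_ab.sum(axis=0))
    return shannon_entropy(p_ab), h_a, h_b, mutual_information(p_ab)

# Hypothetical energy levels and coupling pattern (illustrative only).
e_a = np.array([0.0, 1.0, 2.0])
e_b = np.array([0.0, 1.5])
coupling = np.array([[0.0, 1.0],
                     [1.0, 0.0],
                     [0.5, -0.5]])

for strength in (0.0, 1.0, 5.0):
    h_ab, h_a, h_b, mi = entropy_balance(joint_boltzmann(e_a, e_b, strength * coupling))
    print(f"strength {strength:3.1f}: H(A,B) = {h_ab:.4f}, "
          f"H(A) + H(B) = {h_a + h_b:.4f}, I(A;B) = {mi:.4f}")
```

In this toy model the gap between $H(A) + H(B)$ and $H(A,B)$ is exactly $I(A;B)$, which is zero without coupling and positive once the interaction term matters.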

This seems to leave no room for special "non-extensive" modifications to the entropy formula such as Equation $1$. The Tsallis entropy is non-extensive for independent sub-systems, but it seems the cases where we need a non-extensive entropy are exactly the cases where the sub-systems are not independent, and therefore the Gibbs-Shannon entropy is already non-extensive.

After that long explanation, my questions are: (i) Is the above characterisation of the motivation for Tsallis entropy correct, or are there cases where the parts of a system can be statistically independent and yet we still need a non-extensive entropy? (ii) What is the current consensus on the validity of Tsallis entropy-based approaches to statistical mechanics? I know that it's been the subject of debate in the past, but Wikipedia seems to imply that this is now settled and the idea is now widely accepted. I'd like to know how true this is. Finally, (iii) can the argument I sketched above be found in the literature? I had a quick look at some dissenting opinions about Tsallis entropy, but surprisingly I didn't immediately see the point about mutual information and the non-extensivity of Gibbs-Shannon entropy.

(I'm aware that there's also a more pragmatic justification for using the Tsallis entropy, which is that maximising it tends to lead to "long-tailed" power-law type distributions. I'm less interested in that justification for the sake of this question. Also, I'm aware there are some similar questions on the site already [1,2], but these don't cover the non-extensivity argument I'm concerned with here, and the answers only deal with the Rényi entropy.)

Best Answer

(i) Is the above characterisation of the motivation for Tsallis entropy correct, or are there cases where the parts of a system can be statistically independent and yet we still need a non-extensive entropy?

The one example I can think of that fits this description is a collisionless plasma (well, at least weakly collisional), like the solar wind.

Over scales larger than the Debye length, the system behaves in a collective manner but the collisionless nature of the gas keeps it from reaching equilibrium. Further, even though electromagnetic fields produce long-range interactions, the "parts" of the system (e.g., Debye spheres) can still be statistically independent. This allows a collisionless plasma to behave according to a non-extensive kinetic theory.

(ii) What is the current consensus on the validity of Tsallis entropy-based approaches to statistical mechanics? I know that it's been the subject of debate in the past, but Wikipedia seems to imply that this is now settled and the idea is now widely accepted. I'd like to know how true this is.

I think the validity of Tsallis entropy is generally accepted, at least in space plasma physics [e.g., see Livadiotis, 2015]. The support for a non-Maxwell-Boltzmann theory arose because of the continual observation of velocity distributions (e.g., Maxwellian) that had power-law tails and the lack of observations of Maxwellians. Initial attempts to model these distributions included superpositions of modified Lorentzian distributions (e.g., similar to Cauchy distributions) with Maxwellians [e.g., Feldman et al., 1983; Thomsen et al., 1983]. Later studies [e.g., Maksimovic et al., 1997] resurrected an old form called a kappa distribution, which was originally derived by Vasyliunas [1968]. Eventually, Leubner [2002] showed the connection between the kappa distribution and the Tsallis distribution when $\kappa = -1/\left( q - 1 \right)$, where $q$ is the entropic parameter from Tsallis statistics. (Note that the kappa distribution is a member of the modified Lorentzian distributions.)

More recently, a great deal of work has gone into solidifying the relationship between kappa distributions, Tsallis statistics, and fundamental thermodynamics, including attempts to merge the more traditional statistical mechanics with non-extensive statistical mechanics [e.g., Livadiotis, 2015; Treumann and Baumjohann, 2014, 2016].

While there is still some hesitation by some in the community, the fact that nearly all particle velocity distributions observed to date in collisionless space plasmas can be modeled by kappa distributions more accurately than Maxwellians is strong support for Tsallis statistics.

Finally, (iii) can the argument I sketched above be found in the literature? I had a quick look at some dissenting opinions about Tsallis entropy, but surprisingly I didn't immediately see the point about mutual information and the non-extensivity of Gibbs-Shannon entropy.

The long-range interactions and the collisionless nature of some plasmas cause these systems to continually be in a state of non-equilibrium. This type of system requires a non-extensive formalism, as Leubner [2002] states:

Any extensive formalism fails whenever a physical system includes long-range forces or long-range memory. In particular, this situation is usually found in astrophysical environments and plasma physics where, for example, the range of interactions is comparable to the size of the system considered. A generalized entropy is required to possess the usual properties of positivity, equiprobability, concavity and irreversibility but suitably extending the standard additivity to nonextensivity...

References

  • Feldman, W.C., et al., "Electron Velocity Distributions Near the Earth's Bow Shock," Journal of Geophysical Research 88(A1), pp. 96--110, doi:10.1029/JA088iA01p00096, 1983.
  • Leubner, M.P. "A Nonextensive Entropy Approach to Kappa-Distributions," Astrophysics and Space Science 282(3), pp. 573--579, doi:10.1023/A:1020990413487, 2002.
  • Livadiotis, G. "Introduction to special section on Origins and Properties of Kappa Distributions: Statistical Background and Properties of Kappa Distributions in Space Plasmas," Journal of Geophysical Research: Space Physics 120(3), pp. 1607--1619, doi:10.1002/2014JA020825, 2015.
  • Maksimovic, M., et al., "Ulysses electron distributions fitted with Kappa functions," Geophysical Research Letters 24(9), pp. 1151--1154, doi:10.1029/97GL00992, 1997.
  • Thomsen, M.F., et al., "Stability of Electron Distributions Within the Earth's Bow Shock," Journal of Geophysical Research 88(A4), pp. 3035--3045, doi:10.1029/JA088iA04p03035, 1983.
  • Treumann, R.A. and W. Baumjohann "Beyond Gibbs-Boltzmann-Shannon: general entropies—the Gibbs-Lorentzian example," Frontiers in Physics 2(49), pp. 1--5, doi:10.3389/fphy.2014.00049, 2014.
  • Treumann, R.A. and W. Baumjohann "Generalised partition functions: inferences on phase space distributions," Annales Geophysicae 34(6), pp. 557--564, doi:10.5194/angeo-34-557-2016, 2016.
  • Vasyliunas, V.M. "A survey of low-energy electrons in the evening sector of the magnetosphere with OGO 1 and OGO 3," Journal of Geophysical Research 73(9), pp. 2839--2884, doi:10.1029/JA073i009p02839, 1968.