What you've stumbled upon is called the "Gibbs paradox", and the resolution is to divide the phase-space volume used in entropy calculations in statistical mechanics by the identical-particle factor $N!$, which reduces the number of distinct configurations.
Since the temperature is unchanged in the process, the momentum distribution of the atoms is the same before and after, so it drops out, and the entropy change is entirely spatial, as you realized. The volume of configuration space for the left part is:
${V_1^N \over N!}$
and for the right part is:
${V_2^N\over N!}$
And the total volume of the 2N particle configuration space is:
${(V_1 V_2)^N \over (N!)^2}$
When you lift the barrier, you get the spatial volume of configuration space
${(V_1 + V_2)^{2N} \over (2N)!}$
When $V_1$ and $V_2$ are equal, you naively would expect zero entropy gain. But you do gain a tiny little bit of entropy by removing the wall. Before you removed the wall, the number of particles on the left and on the right were exactly equal, now they can fluctuate a little bit. But this is a negligible amount of extra entropy in the thermodynamic limit, as you can see:
${(2V)^{2N}\over (2N)!} = {2^{2N}(N!)^2\over (2N)!}{V^{2N}\over (N!)^2}$
So the extra entropy from lifting the barrier is equal to:
$\log\left({2^{2N}(N!)^2\over (2N)!}\right)$
You might recognize the reciprocal of the argument, ${(2N)!\over 2^{2N}(N!)^2} = \binom{2N}{N}2^{-2N}$: it is the probability that a symmetric $\pm 1$ random walk returns to the origin after $2N$ steps, i.e. the biggest entry of Pascal's triangle at stage $2N$ normalized by the sum of all the entries at that stage. From this random-walk identity (or, equivalently, directly from Stirling's formula), you can estimate its size as ${1\over \sqrt{\pi N}}$, so the extra entropy grows like ${1\over 2}\log(\pi N)$. It is sub-extensive: the entropy gain per particle vanishes in the thermodynamic limit.
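As a quick numeric sanity check (an illustration, not part of the original argument), the exact sub-extensive term $\log\left(2^{2N}(N!)^2/(2N)!\right)$ can be evaluated with log-gamma functions and compared with the Stirling estimate $\tfrac12\log(\pi N)$, which follows from $\binom{2N}{N}2^{-2N}\approx 1/\sqrt{\pi N}$:

```python
import math

def log_subextensive(N):
    """log(2^(2N) * (N!)^2 / (2N)!) computed via lgamma to avoid overflow."""
    return 2 * N * math.log(2) + 2 * math.lgamma(N + 1) - math.lgamma(2 * N + 1)

for N in (10, 1_000, 1_000_000):
    exact = log_subextensive(N)
    stirling = 0.5 * math.log(math.pi * N)  # leading Stirling estimate
    print(N, exact, stirling, exact / N)    # per-particle value tends to 0
```

The last column shows the per-particle contribution vanishing as $N$ grows, which is exactly what "sub-extensive" means here.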
The entropy change in the general case is then exactly the logarithm of the ratio of the two configuration-space volumes after and before:
$e^{\Delta S} = { (V_1 + V_2)^{2N} \over (2N)! } { (N!)^2 \over V_1^N V_2^N } = { ({V_1 + V_2 \over 2})^{2N} \over V_1^N V_2^N } {2^{2N}(N!)^2\over (2N)!}$
Ignoring the thermodynamically negligible last factor, the macroscopic change in entropy, the part proportional to $N$, is:
$\Delta S = N\log\left({(V_1 + V_2)^2 \over 4 V_1 V_2}\right)$
Up to a sign convention, this is what you calculated.
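Here is a short numeric check of this result (an illustration, not part of the original derivation): computing $\Delta S = S_{\text{after}} - S_{\text{before}}$ directly from the two configuration-space volumes and comparing with the macroscopic formula. With this sign convention $\Delta S$ is positive, and the leftover is the sub-extensive factor:

```python
import math

def config_entropy_before(V1, V2, N):
    # S_before = log(V1^N / N!) + log(V2^N / N!)
    return N * math.log(V1) + N * math.log(V2) - 2 * math.lgamma(N + 1)

def config_entropy_after(V1, V2, N):
    # S_after = log((V1 + V2)^(2N) / (2N)!)
    return 2 * N * math.log(V1 + V2) - math.lgamma(2 * N + 1)

V1, V2, N = 3.0, 1.0, 10**6
dS = config_entropy_after(V1, V2, N) - config_entropy_before(V1, V2, N)
macroscopic = N * math.log((V1 + V2) ** 2 / (4 * V1 * V2))
# The difference is the sub-extensive term, roughly (1/2) log(pi N)
print(dS, macroscopic, dS - macroscopic)
```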
Additional comments
You might think it is weird to gain a little entropy just from the fact that, before you lift the wall, you knew the particle number on each side was exactly $N$, even if that entropy is sub-extensive. Wouldn't that mean that when you lower the wall, you reduce the entropy by a tiny sub-extensive amount, by preventing mixing of the right and left halves? Even a tiny entropy decrease would violate the second law.
There is no entropy decrease, because when you lower the barrier, you don't know how many molecules are on the left and how many are on the right. If you add the entropy of ignorance to the entropy of the lowered wall system, it exactly removes the subextensive entropy loss. If you try to find out how many molecules are on the right vs how many are on the left, you produce more entropy in the process of learning the answer than you gain from the knowledge.
(i) Is the above characterisation of the motivation for Tsallis entropy correct, or are there cases where the parts of a system can be statistically independent and yet we still need a non-extensive entropy?
The one example I can think of that fits this description is a collisionless plasma (well, at least weakly collisional), like the solar wind.
Over scales larger than the Debye length, the system behaves in a collective manner but the collisionless nature of the gas keeps it from reaching equilibrium. Further, even though electromagnetic fields produce long-range interactions, the "parts" of the system (e.g., Debye spheres) can still be statistically independent. This allows a collisionless plasma to behave according to a non-extensive kinetic theory.
(ii) What is the current consensus on the validity of Tsallis entropy-based approaches to statistical mechanics? I know that it's been the subject of debate in the past, but Wikipedia seems to imply that this is now settled and the idea is now widely accepted. I'd like to know how true this is.
I think the validity of Tsallis entropy is generally accepted, at least in space plasma physics [e.g., see Livadiotis, 2015]. The support for a non-Maxwell-Boltzmann theory arose from the continual observation of velocity distributions with power-law tails and the near-total absence of purely Maxwellian distributions. Initial attempts to model these distributions used superpositions of Maxwellians with modified Lorentzian distributions (similar to Cauchy distributions) [e.g., Feldman et al., 1983; Thomsen et al., 1983]. Later studies [e.g., Maksimovic et al., 1997] resurrected an older form called the kappa distribution, originally derived by Vasyliunas [1968]. Eventually, Leubner [2002] showed the connection between the kappa distribution and the Tsallis distribution when $\kappa = -1/\left( q - 1 \right)$, where $q$ is the entropic parameter from Tsallis statistics. (Note that the kappa distribution is a member of the modified Lorentzian family.)
More recently, a great deal of work has begun to solidify the relationship between kappa distributions, Tsallis statistics, and fundamental thermodynamics, attempting to merge traditional statistical mechanics with non-extensive statistical mechanics [e.g., Livadiotis, 2015; Treumann and Baumjohann, 2014, 2016].
While there is still some hesitation in parts of the community, the fact that nearly all particle velocity distributions observed to date in collisionless space plasmas can be modeled more accurately by kappa distributions than by Maxwellians is strong support for Tsallis statistics.
Finally, (iii) can the argument I sketched above be found in the literature? I had a quick look at some dissenting opinions about Tsallis entropy, but surprisingly I didn't immediately see the point about mutual information and the non-extensivity of Gibbs-Shannon entropy.
The long-range interactions and the collisionless nature of some plasmas cause these systems to remain continually in a state of non-equilibrium. This type of system requires a non-extensive formalism, as Leubner [2002] states:
Any extensive formalism fails whenever a physical system includes long-range forces or long-range memory. In particular, this situation is usually found in astrophysical environments and plasma physics where, for example, the range of interactions is comparable to the size of the system considered. A generalized entropy is required to possess the usual properties of positivity, equiprobability, concavity and irreversibility but suitably extending the standard additivity to nonextensivity...
References
- Feldman, W.C., et al., "Electron Velocity Distributions Near the Earth's Bow Shock," Journal of Geophysical Research 88(A1), pp. 96--110, doi:10.1029/JA088iA01p00096, 1983.
- Leubner, M.P. "A Nonextensive Entropy Approach to Kappa-Distributions," Astrophysics and Space Science 282(3), pp. 573--579, doi:10.1023/A:1020990413487, 2002.
- Livadiotis, G. "Introduction to special section on Origins and Properties of Kappa Distributions: Statistical Background and Properties of Kappa Distributions in Space Plasmas," Journal of Geophysical Research: Space Physics 120(3), pp. 1607--1619, doi:10.1002/2014JA020825, 2015.
- Maksimovic, M., et al., "Ulysses electron distributions fitted with Kappa functions," Geophysical Research Letters 24(9), pp. 1151--1154, doi:10.1029/97GL00992, 1997.
- Thomsen, M.F., et al., "Stability of Electron Distributions Within the Earth's Bow Shock," Journal of Geophysical Research 88(A4), pp. 3035--3045, doi:10.1029/JA088iA04p03035, 1983.
- Treumann, R.A. and W. Baumjohann "Beyond Gibbs-Boltzmann-Shannon: general entropies—the Gibbs-Lorentzian example," Frontiers in Physics 2(49), pp. 1--5, doi:10.3389/fphy.2014.00049, 2014.
- Treumann, R.A. and W. Baumjohann "Generalised partition functions: inferences on phase space distributions," Annales Geophysicae 34(6), pp. 557--564, doi:10.5194/angeo-34-557-2016, 2016.
- Vasyliunas, V.M. "A survey of low-energy electrons in the evening sector of the magnetosphere with OGO 1 and OGO 3," Journal of Geophysical Research 73(9), pp. 2839--2884, doi:10.1029/JA073i009p02839, 1968.
Best Answer
The problem you are having is that you are not applying the formula $S = -\sum_{i} p_i \log p_i$ correctly. (Here I will take $k_B=1$.) In this formula, $i$ is supposed to index the set of all microscopic states. I will first say what you did wrong, then I will say how to do the problem correctly.
To summarize the problem, you have a volume of gas with left and right parts separated by a partition, so that the left part occupies a fraction $x$ of the volume. Then $V_L = x V$, where $V$ is the total volume and $V_L$ is the volume of the left part. Also, the densities are the same, so that $N_L = x N$, where $N_L$ is the number of particles in the left part and $N$ is the total number of particles. The volume and number of particles on the right side are $V_R = (1-x)V$ and $N_R = (1-x)N$.
You said the probability of there being $n$ particles on the left is $p(n)=\binom{N}{n}x^n(1-x)^{N-n}$. This is a perfectly true statement. The next thing you did was plug this into the formula $S = -\sum_{i} p_i \log p_i$. The result would give you the right entropy if there were exactly one microscopic state with $n$ particles on the left and $N-n$ on the right. However, this is not the case: there are many such microscopic states.
Now let's see how to do the problem correctly. For the systems considered here, we assume each of the $N_s$ microscopic states is equally likely, so the probability of a given microscopic state is $p_i = 1/N_s$. Our formula for the entropy becomes $S = -\sum_{i} \frac{1}{N_s} \log\frac{1}{N_s} = N_s \cdot \frac{1}{N_s} \log N_s = \log(N_s).$
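The step above is easy to verify numerically with the standard Gibbs–Shannon form $S = -\sum_i p_i \log p_i$ (a minimal check, not part of the argument):

```python
import math

N_s = 1000
p = [1.0 / N_s] * N_s                       # N_s equally likely microstates
S = -sum(pi * math.log(pi) for pi in p)     # Gibbs-Shannon entropy
print(S, math.log(N_s))                     # the two agree
```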
The number of microscopic states is equal to the volume of the available phase space. Since there is no spatially varying potential, the total volume of available phase space can be written as a product $V_C V_P$ of the available volume in configuration space times the available volume in momentum space. Then $S=\log(N_s) = \log(V_C) + \log(V_P)$. For our problem, the volume available in momentum space does not change when the wall is removed, so its contribution to the entropy does not change. Therefore we need only consider the change in $\log(V_C)$.
What is $V_C$ initially? Well, $V_C$ for a gas of $N$ indistinguishable particles in a volume $V$ is $V^N/N!$. The $N!$ arises because the particles are indistinguishable, so two states that differ only by a permutation of the particles should not be counted separately. Initially we have two independent systems. The left system has a configurational volume $V_L^{N_L}/N_L!$, and the right system has a configurational volume $V_R^{N_R}/N_R!$. The total configurational volume is the product of these two, so $V_C=\frac{V_L^{N_L}}{N_L!}\frac{V_R^{N_R}}{N_R!}$. The initial configurational entropy is then $S_i = \log(V_C) = N_L\log(V_L) - \log(N_L!) + N_R\log(V_R) - \log(N_R!)$.
After the barrier has been removed, each particle has the full volume $V$ available to it instead of just $V_L$ or $V_R$, so the final configurational entropy is $S_f = N_L\log(V) - \log(N_L!) + N_R\log(V) - \log(N_R!)$.
The factorial terms cancel, so the change in entropy is $S_f - S_i = N_L\log(\frac{V}{V_L}) + N_R\log(\frac{V}{V_R}) \\ = N_L\log(1/x) + N_R\log(1/(1-x))\\ = xN\log(1/x) + (1-x)N\log(1/(1-x))\\ = N\left(x\log(1/x) + (1-x)\log(1/(1-x))\right)\\ = -N\left(x\log(x) + (1-x)\log(1-x)\right).$
This is the answer you wanted.
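For completeness, here is a small numeric check of the final result, following this answer's counting (the $\log N_L!$ and $\log N_R!$ terms cancel between $S_f$ and $S_i$, so they are dropped):

```python
import math

def mixing_entropy(N, x):
    """S_f - S_i from the configurational volumes (factorials cancel)."""
    N_L, N_R = x * N, (1 - x) * N
    V, V_L, V_R = 1.0, x, 1 - x                    # total volume set to 1
    S_i = N_L * math.log(V_L) + N_R * math.log(V_R)
    S_f = N_L * math.log(V) + N_R * math.log(V)    # = 0 with V = 1
    return S_f - S_i

N, x = 10**6, 0.25
direct = mixing_entropy(N, x)
formula = -N * (x * math.log(x) + (1 - x) * math.log(1 - x))
print(direct, formula)  # the two agree
```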