What you've stumbled upon is called the "Gibbs paradox", and the resolution is to divide the phase space for entropy calculations in statistical mechanics by the identical particle factor, which reduces the number of configurations.
Since the temperature is unchanged in the process, the momentum distribution of the atoms is unimportant, it is the same before and after, and the entropy is entirely spatial, as you realized. The volume of configuration space for the left part is:
${V_1^N \over N!}$
and for the right part is:
${V_2^N\over N!}$
And the total volume of the 2N particle configuration space is:
$(V_1V_2)^N\over (N!)^2$
When you lift the barrier, you get the spatial volume of configuration space
$(V_1 + V_2)^{2N} \over (2N)!$
When $V_1$ and $V_2$ are equal, you naively would expect zero entropy gain. But you do gain a tiny little bit of entropy by removing the wall. Before you removed the wall, the number of particles on the left and on the right were exactly equal, now they can fluctuate a little bit. But this is a negligible amount of extra entropy in the thermodynamic limit, as you can see:
${(2V)^{2N}\over (2N)!} = {2^{2N}(N!)^2\over (2N)!}{V^{2N}\over (N!)^2}$
So that the extra entropy from lifting the barrier is equal to:
$ \log ({(2N)!\over 2^{2N}(N!)^2})$
You might recognize the thing inside the log, it's the probability that a symmetric +/-1 random walk returns to the origin after N steps, i.e. the biggest term of the Pascal triangle at stage 2N when normalized by the sum of all the terms of Pascal's triangle at that stage. From the Brownian motion identity or equivalently, directly from Stirling's formula), you can estimate its size as ${1\over \sqrt{2\pi N}}$, so that the logarithm goes as log(N), it is sub-extensive, and vanishes for large numbers.
The entropy change in the general case is then exactly given by the logarithm of the ratio of the two configuration space volumes before and after:
$e^{\Delta S} = { V_1^N V_2^N \over (N!)^2 } { (2N)! \over (V_1 + V_2)^{2N}} = { V_1^N V_2^N \over ({V_1 + V_2 \over 2})^{2N}} {(2N)!\over 2^{2N}(N!)^2}$
Ignoring the thermodynamically negligible last factor, the macroscopic change in entropy, the part proprtional to N, is:
$\Delta S = N\log({4 V_1 V_2 \over (V_1 + V_2)^2})$
up to a sign, it is as you calculated.
Additional comments
You might think that it is weird to gain a little bit of entropy just from the fact that before you lift the wall you knew that the particle numbers were exactly N, even if that entropy is subextensive. Wouldn't that mean that when you lower the wall, you reduce the entropy a tiny subextensive amount, by preventing mixing of the right and left half? Even if the entropy decrease is tiny, it still violates the second law.
There is no entropy decrease, because when you lower the barrier, you don't know how many molecules are on the left and how many are on the right. If you add the entropy of ignorance to the entropy of the lowered wall system, it exactly removes the subextensive entropy loss. If you try to find out how many molecules are on the right vs how many are on the left, you produce more entropy in the process of learning the answer than you gain from the knowledge.
Best Answer
The entropy of $N$ molecules of an ideal gas in a volume $V$ at temperature $T$ can be expressed as:
$$S(N, V, T) = N k\log\left(\frac{V}{V_0}\right) + C_v\log\left(\frac{T}{T_0}\right) + S(N, V_0,T_0)$$
Here $V_0$ and $T_0$ define arbitrary standard conditions at which the entropy is known, and $C_v$ is the total heat capacity at constant volume. To derive this formula, you can consider the change in entropy from the standard conditions to the final state using an isothermal process at constant pressure where heat is added to the system, thus yields the first term. After that we can change the temperature from $T_0$ to $T$ by adding heat to the system at constant volume, the entropy change due to that process is given by the second term.
The initial entropy of the system can thus be expressed as:
$$S_{\text{initial}} = S(N,V_1,T_i) + S(N,V_2,T_i) = N k\log\left(\frac{V_1V_2}{V_0^2}\right) + 2C_v\log\left(\frac{T_i}{T_0}\right) + K$$
where $K$ is a constant (for problems where the total number of molecules in the system does not change). The final state will be a state where the molecules are (or can be considered to be) in a volume of $V_1 + V_2$ at some temperature $T_f$. If no work can be extracted anymore the gases in the two boxes must be in thermal equilibrium with each other and then doesn't matter whether or not there is a separation between the gases. The final entropy is thus given by:
$$S_{\text{final}} = S(2N,V_1+V_2,T_f) = 2N k\log\left(\frac{V_1+V_2}{V_0}\right) + 2C_v\log\left(\frac{T_f}{T_0}\right) + K$$
Then for any process involving only the two boxes, $S_{\text{final}}\geq S_{\text{initial}}$. The maximum amount of work we can extract from the system is obtain in the reversible case where the entropy stays the same. We can see this by considering two processes, one where the entropy increases and one where it stays the same. Then we can go from the latter to the former by dumping energy extracted in the form of work as heat into the system at constant volume of $V_1+V_2$ until we reach the same entropy as the former system (and as a result also the final temperature of the latter system, as volume, entropy and number of molecules completely determine the thermodynamic state of the system). Since we've then thrown away work to arrive at the former end state, with entropy increase you're alway worse off then when the entropy stays the same.
To find the maximum amount of work, we thus need to equate $S_{\text{final}}$ to $S_{\text{initial}}$, we can then solve for $T_{f}$, the drop in the internal energy is then the maximum amount of work extracted from the system (note that no heat can have been added or extracted from the system, because the total entropy has stayed the same, therefore the entire internal energy change is due to work). Solving for $T_f$ yields:
$$T_f = T_i \left(\sqrt{\frac{V_2}{V_1}}+\sqrt{\frac{V_1}{V_2}}\right)^{-\frac{N k}{C_v}} = T_i \left(\sqrt{\frac{V_2}{V_1}}+\sqrt{\frac{V_1}{V_2}}\right)^{1-\gamma}$$
where we've used that $C_V = \dfrac{f}{2}N k$ and $\gamma = \dfrac{f+2}{f}$ where $f$ is the effective number of degrees of freedom per molecule.
The total amount of work $W$ that can be extracted is therefore equal to:
$$W = 2 C_V (T_i - T_f) = \frac{2N kT_i}{\gamma - 1}\left[1-\left(\sqrt{\frac{V_2}{V_1}}+\sqrt{\frac{V_1}{V_2}}\right)^{1-\gamma}\right] $$