I'll state one version of the theorem, valid for classical systems. I'll not give the most general framework, as things become messy, but this should still give you an idea of how general the result is.
We need the following ingredients:
- Spins: to each vertex of the lattice $\mathbb{Z}^2$, we attach a spin $\phi_x$ taking values in some compact topological space $S$.
- Symmetry group: a compact, connected Lie group $G$ acting on $S$.
- Interaction: a piecewise-continuous function $U:S\times S\to\mathbb{R}$, invariant under the action of $G$: $U(g\phi_x,g\phi_y) = U(\phi_x,\phi_y)$, for each $g\in G$.
- Coupling constants: a collection $(J_x)_{x\in\mathbb{Z}^2}$ of nonnegative real numbers, such that $\sum_{x\neq 0} J_x<\infty$.
We then consider the formal Hamiltonian
$$
H(\phi) = \sum_{x\neq y\in\mathbb{Z}^2} J_{y-x} U(\phi_x,\phi_y).
$$
There is no loss of generality in assuming that $\sum_{x\neq 0} J_x = 1$ (since one can always rescale $U$). With this normalization, we can consider the random walk $X$ on $\mathbb{Z}^2$ with transition probabilities from $x$ to $y$ given by $J_{y-x}$.
The statement then takes the following form: Under the above assumptions, all infinite-volume Gibbs measures associated to the formal Hamiltonian $H$ are invariant under the action of $G$, provided that the random walk $X$ is recurrent.
As an example, consider the case of the $O(N)$ model. In that case, $S=\mathbb{S}^{N-1}$ is the $(N-1)$-sphere, $G=O(N)$ is the group of rotations of $\mathbb{S}^{N-1}$, $U(\phi_x,\phi_y) = -\phi_x \cdot \phi_y$ is minus the scalar product of the two unit vectors. The above result shows that all infinite-volume Gibbs measures associated to the $O(N)$-model are rotation invariant (which implies in particular that there cannot be spontaneous magnetization) as soon as the random walk $X$ is recurrent. Interestingly, it is known, in that case, that there is spontaneous magnetization (and, thus, spontaneous breaking of the rotation symmetry) at low temperatures, as soon as the random walk $X$ is transient. If you prefer a more explicit criterion, restrict your attention to the case $J_x \propto |x|^{-\alpha}$. Then the previous discussion implies that there is spontaneous symmetry breaking at low temperatures in the $O(N)$-model if and only if $\alpha<4$.
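To see where the threshold $\alpha=4$ comes from, here is a rough sketch based on the standard Fourier criterion for recurrence of a symmetric random walk. The notation $\hat J$ is introduced only for this illustration, the small-$k$ asymptotics are quoted without proof, and $\alpha>2$ is assumed so that $\sum_{x\neq 0}J_x<\infty$.
$$
\hat J(k) = \sum_{x\neq 0} J_x\, e^{i k\cdot x},
\qquad
X \text{ recurrent} \iff \int_{[-\pi,\pi]^2} \frac{dk}{1-\hat J(k)} = \infty .
$$
$$
1-\hat J(k) \sim
\begin{cases}
|k|^{\alpha-2}, & 2<\alpha<4,\\
|k|^{2}\log(1/|k|), & \alpha=4,\\
|k|^{2}, & \alpha>4,
\end{cases}
\qquad (k\to 0).
$$
In polar coordinates the integral near $k=0$ behaves like $\int_0 |k|^{3-\alpha}\,d|k|$ for $2<\alpha<4$, which converges, like $\int_0 d|k|/\bigl(|k|\log(1/|k|)\bigr)$ for $\alpha=4$, which diverges, and like $\int_0 d|k|/|k|$ for $\alpha>4$, which also diverges. So the walk should be transient exactly when $2<\alpha<4$, matching the criterion above.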
[EDIT:] Here's a (very incomplete) list of references for some of the points mentioned above.
Version of the theorem given above:
2D Models of Statistical Physics with Continuous Symmetry: The Case of Singular Interactions, D. Ioffe, S. Shlosman and Y. Velenik, Commun. Math. Phys. 226, 433-454 (2002). arXiv:math/0110127
(The result is actually a bit more general than the one stated above.)
Proof for general graphs (under the assumption that the associated random walk is recurrent and for twice continuously differentiable interaction $U$):
Recurrent random walks and the absence of continuous symmetry breaking on graphs, F. Merkl and H. Wagner, J. Statist. Phys. 75 (1994), no. 1-2, 153–165.
(Again, their results are substantially more general than that: they treat not necessarily ferromagnetic couplings, quantum systems, etc.)
Proof that $O(N)$ models on $\mathbb{Z}^d$ display spontaneous magnetization at low temperatures as soon as the associated random walk is transient:
The Mermin-Wagner phenomenon and cluster properties of one- and two-dimensional systems, C. A. Bonato, J. F. Perez, A. Klein, J. Statist. Phys. 29 (1982), no. 2, 159–175.
You can also check Theorem (20.15) in
Gibbs measures and phase transitions, H.-O. Georgii, de Gruyter Studies in Mathematics, 9. Walter de Gruyter & Co., Berlin, 1988.
There are of course many other relevant references. Please check the bibliography given in these references.
None of the answers here or in the other question addresses the important difference between superconductivity and superfluidity: namely, that the Nambu-Goldstone modes in superconductors are not gapless. Gapless modes are an assumption of the Mermin-Wagner-Hohenberg-Coleman theorem, which therefore does not apply.
The question of whether superconductivity can exist in a strictly 2D system turns out to be very interesting. Let's go in steps:
The Mermin-Wagner-Hohenberg-Coleman theorem precludes true long-range order in 2 dimensions (at finite temperature) or 1 dimension (at zero temperature). The reason, as is clear from Coleman's proof, is that the fluctuations of linearly dispersing scalar modes, i.e. Nambu-Goldstone modes, in 2+0 or 1+1D are so violent as to preclude their existence altogether. This is due to an infrared divergence, so it applies to long wavelengths/large systems.
However, the Nambu-Goldstone modes in a superconductor are gapped by the Anderson-Higgs mechanism (due to the coupling to gauge fields, here the electromagnetic field). Therefore, there is no infrared divergence: the $k^2$ term in the denominator is replaced by $k^2 + (\hbar\omega_{\rm{p}}/c)^2$, with $\omega_{\rm{p}}$ the plasma frequency. The theorem does not apply.
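As a rough illustration of the difference, one can compare the two fluctuation integrals directly. In this sketch $a$ is a short-distance cutoff, $L$ the linear system size, and $m$ is shorthand for the gap term $\hbar\omega_{\rm{p}}/c$; these symbols and the sharp cutoffs are assumptions made for the example, and overall prefactors are dropped.
$$
\int_{1/L<|k|<1/a} \frac{d^2k}{k^2} = 2\pi \ln\frac{L}{a} \;\xrightarrow[L\to\infty]{}\; \infty ,
$$
$$
\int_{1/L<|k|<1/a} \frac{d^2k}{k^2+m^2} = \pi \ln\frac{a^{-2}+m^2}{L^{-2}+m^2} \;\xrightarrow[L\to\infty]{}\; \pi\ln\!\left(1+\frac{1}{a^2m^2}\right) < \infty .
$$
The gapless integral grows without bound with the system size, while the gapped one saturates, which is why the infrared argument behind the theorem has nothing to act on.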
So there seems to be no obstruction to superconductivity in any low dimension. But this does not take into account the topological defects (vortices). In 2D superfluids, there is the BKT phase transition between a low-temperature, quasi-long-range ordered phase, where vortex pairs are bound, and a high-temperature, disordered phase, where vortex pairs are unbound. The transition temperature is set by the balance between the energy cost of a vortex pair (which grows logarithmically with system size) and the entropy gain of having thermally excited pairs (which also grows logarithmically with system size). But in superconductors, the size of a vortex is capped by the inverse of the superconducting energy gap $\Delta$, so its energy cost no longer grows with system size. The entropy gain, on the other hand, is unaffected. Therefore the argument leading to the BKT phase transition does not apply, and the transition temperature is pushed to zero as the system size grows. In an infinite volume, vortices are unbound at any temperature.
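The energy-entropy balance can be sketched with the textbook single-vortex estimate. Here $J_s$ (phase stiffness), $\xi$ (vortex core size), $\ell$ (the finite length at which the vortex energy saturates) and $L$ (system size) are symbols assumed for this illustration, and numerical factors of order one are not to be taken literally.
$$
F_{\text{neutral}} \simeq \pi J_s \ln\frac{L}{\xi} - 2 k_B T \ln\frac{L}{\xi}
= \left(\pi J_s - 2k_B T\right)\ln\frac{L}{\xi}
\quad\Longrightarrow\quad
k_B T_{\rm BKT} \simeq \frac{\pi J_s}{2} .
$$
$$
F_{\text{charged}} \simeq \pi J_s \ln\frac{\ell}{\xi} - 2 k_B T \ln\frac{L}{\xi}
\;\xrightarrow[L\to\infty]{}\; -\infty \quad \text{for any } T>0 .
$$
In the first case both terms scale with $\ln L$, so their competition fixes a finite transition temperature. In the second, the energy is bounded while the entropy still grows, so free vortices are favorable at any nonzero temperature once the system is large enough.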
So the conclusion seems to be that, despite the non-applicability of the MWHC-theorem, superconductivity cannot exist in an infinite, strictly 2D system.
In real life, however, there are many examples of quasi-2D systems (even monolayers) that exhibit all the signs of superconductivity, including dissipationless current and a form of Meissner effect. The reason is that the electromagnetic field is not restricted to 2D. The field lines permeate out of the 2D layer. This causes the in-plane penetration depth $\lambda_{\rm{2D}}$ to become very large. To lowest order, one gets:
$$
\lambda_{\rm{2D}} = \frac{\lambda_{\rm{L}}^2}{d}
$$
where $\lambda_{\rm{L}}$ is the usual London penetration depth depending on the superconducting order parameter, and $d$ is the thickness of the system. For a very thin sample, the penetration depth diverges. In other words, it is a very strong type-II superconductor.
If the penetration depth becomes larger than the linear size of the system, it is effectively a neutral superfluid as far as transverse electromagnetic effects are concerned. This also implies that the vortex energy again depends logarithmically on the system size, and the BKT criterion applies. In fact, the BKT transition has been observed in many quasi-2D superconductors.
To my knowledge, none of this has been investigated really well, or at all. It would make for an interesting research project.
TL;DR The Mermin-Wagner-Hohenberg-Coleman theorem does not apply. However, due to vortex unbinding, strictly 2D superconductors would not exist. In reality, the electromagnetic field is always 3D and turns a 2D superconductor into a neutral superfluid with BKT transition.
First, I will translate the relevant passages in your paper from mathematese.
The argument in your reference
You are studying an X-Y model with the constraint that neighboring spins always have to be within a certain angle of each other. You define the collection of statistical-mechanics Gibbs distributions using a given boundary condition at infinity, as the boundaries get further and further away. Then you note that if the field at the boundary makes the spin turn around from top to bottom the maximum possible amount, then the spins are locked in place: they can't move, because they need to make a certain winding, and unless they are at the maximum possible angle, they can't make the winding.
Using these boundary conditions, there is no free energy, there is no thermodynamics, there is no spin-wave limit, and the Mermin Wagner theorem fails.
You also claim that the theorem fails with a translation invariant measure, which is just given by averaging the same thing over different centers. You attempt to make the thing more physical by allowing the boundary condition to fluctuate around the mean by a little bit $\delta$. But in order to keep the boundary winding condition tight, as the size of the box $N$ goes to infinity, $\delta$ must shrink as $1/N$, and the resulting free energy of your configuration will always be subextensive in the infinite system limit. If $\delta$ does not shrink, the configurations will always randomize their angles, as the Mermin-Wagner theorem says.
The failures of the Mermin-Wagner theorem are all coming from this physically impossible boundary situation, not really from the singular potentials. By forcing the number of allowed configurations to be exactly 1 for all intents and purposes, you are creating a situation where each different average value of the angle has a completely disjoint representative in the thermodynamic limit. This makes the energy as a function of the average angle discontinuous (actually, the energy is infinite except for near one configuration), and makes it impossible to set up spin waves.
This type of argument has a 1d analog, where the analog to Mermin-Wagner is much easier to prove.
1-dimensional mechanical analogy
To see that this result isn't Mermin-Wagner's fault, consider the much easier one-dimensional theorem--- there can be no 1d solid (long range translational order). If you make a potential between points which is infinite at a certain distance D, you can break this theorem too.
What you do is you impose the condition that there are N particles, and the N-th particle is at a distance ND from the first. Then the particles are forced to be right on the edge of the infinite well, and you get the same violation: you form a 1d crystal only by imposing boundary conditions on a translation invariant potential.
The argument in 1d that there can be no crystal order comes from noting that a local defect will shift the average position arbitrarily far out, so as you add more defects, you will wash out the positional order.
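A minimal numerical sketch of this washing-out, assuming a chain in which every bond length fluctuates independently around its mean (a stand-in for nearest-neighbor forces at finite temperature). The function name, the Gaussian bond-length distribution and all parameter values are hypothetical choices made for the illustration.

```python
import random
import statistics


def chain_position_variance(indices, n_samples=3000, mean_spacing=1.0,
                            spacing_sigma=0.1, seed=0):
    """Estimate Var(x_n - x_0) for each particle index n in `indices`.

    Assumption: bond lengths are independent Gaussians, so every bond
    acts like a small local "defect" that shifts all later particles.
    """
    rng = random.Random(seed)
    wanted = sorted(set(indices))
    samples = {n: [] for n in wanted}
    for _ in range(n_samples):
        position = 0.0  # x_0 is pinned at the origin
        next_idx = 0
        for n in range(1, wanted[-1] + 1):
            position += rng.gauss(mean_spacing, spacing_sigma)
            if n == wanted[next_idx]:
                samples[n].append(position)
                next_idx += 1
    # Population variance of the sampled positions for each requested index.
    return {n: statistics.pvariance(samples[n]) for n in wanted}


if __name__ == "__main__":
    for n, var in chain_position_variance([100, 400]).items():
        print(f"particle {n}: position variance ~ {var:.3f}")
```

Under these assumptions the variance of the $n$-th particle's position grows roughly in proportion to $n$, so far enough down the chain the uncertainty exceeds the lattice spacing and the positional order is lost. Pinning both ends at the maximal extension, as in the infinite-well construction, removes exactly this freedom.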
Mermin-Wagner is not affected
The standard arguments for the Mermin-Wagner theorem do not need modification. They assume that there is an actual thermodynamic system, with a nonzero extensive free energy and an entropy proportional to the volume, and this is violated by your example. The case of exactly zero temperature is somewhat analogous: it has no extensive entropy, and at exactly zero temperature, you do break the symmetry.
If you have an extensive entropy, there is a marvelous overlap property which is central to how physicists demonstrate the smoothness of the macroscopic free energy. The Gibbs distributions at two infinitesimally separated angles sum over almost exactly the same configurations (in the sense that for a small enough angle, you can't tell locally that it changed, because the local fluctuations swamp the average, so the local configurations don't notice).
The enormous, nearly complete, overlap between the configurations at neighboring angles demonstrates that the thermodynamic average potentials are much much smoother than the possibly singular potentials that enter into the microscopic description. You always get a quadratic spin-wave density, including in the case of the model you mention, whenever you have an extensive free energy.
Once you have a quadratic spin-wave energy, the Mermin Wagner theorem follows.
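For completeness, the last step can be sketched as follows. The symbols $\rho$ (spin-wave stiffness) and $a$ (lattice cutoff) are assumptions introduced for this illustration, and the result is the usual Gaussian spin-wave estimate rather than a rigorous bound.
$$
E[\theta] = \frac{\rho}{2}\int d^2x\,|\nabla\theta|^2
\quad\Longrightarrow\quad
\langle\cos(\theta_x-\theta_0)\rangle
= e^{-\frac12\langle(\theta_x-\theta_0)^2\rangle}
\sim \left(\frac{|x|}{a}\right)^{-k_B T/(2\pi\rho)}
\;\xrightarrow[|x|\to\infty]{}\; 0 .
$$
The correlations decay to zero at any nonzero temperature, so there is no long-range order, only an algebraic decay.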
Quick answer
The Gibbs distributions for orientation $\theta$ and the Gibbs distributions for orientation $\theta'$ always include locally overlapping configurations as $\theta$ approaches $\theta'$. This assumption fails in your example, because even an infinitesimal change in the angle of the boundary condition changes the configurations completely: they do not have extensive entropy, and are locked to within a $\delta$, shrinking with system size, of an unphysically constrained configuration.