1.
The key point that you're missing is that spontaneous symmetry breaking, or indeed the notion of phase transitions in general, only works for systems with local interactions. A phase transition is defined to be a point in Hamiltonian parameter space at which the free-energy density becomes non-analytic in the infinite-size limit. This definition obviously presupposes the existence of a well-defined infinite-system limit of the free energy density. But for translationally invariant lattice systems like the Ising model, the free energy density only approaches a constant value as $N \to \infty$ if $\sum_j J_{ij}$ converges absolutely, which roughly means that $|J_{ij}|$ has to fall off faster than $1/r^d$, where $d$ is the number of dimensions. In other words, the couplings must be reasonably local.
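To see roughly where the $1/r^d$ threshold comes from, here is a power-counting sketch. It assumes, purely for illustration, a power-law coupling $|J_{ij}| \sim J_0 / r^{\alpha}$ and replaces the lattice sum by an integral with a short-distance cutoff $a$ and a system radius $R$. The symbols $J_0$, $\alpha$, $a$ and $R$ are introduced here and are not part of the argument above.

$$\sum_j |J_{ij}| \sim J_0 \int_a^R \frac{r^{d-1}\, dr}{r^{\alpha}} \sim \begin{cases} \text{const} & \alpha > d \\ \ln(R/a) & \alpha = d \\ R^{\,d-\alpha} & \alpha < d \end{cases}$$

Under these assumptions the interaction energy per site stays bounded as $R \to \infty$ only when $\alpha > d$, which is the sense in which the couplings must fall off faster than $1/r^d$.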
(Experts might object that disordered systems with nonlocal all-to-all couplings, like the Sherrington-Kirkpatrick or SYK models, still have replica-symmetry-breaking phase transitions. But that's actually only true if you rescale the coupling constants as a power of the total system size, which is not a very physical thing to do. If you don't do this, then the phase transition goes away, and indeed the $N \to \infty$ limit becomes ill-defined. Real systems are never truly all-to-all coupled - in practice there's some maximum distance at which the couplings go away, and all-to-all-coupled models are just a convenient approximation.)
Any putative explanation of spontaneous symmetry breaking that doesn't explicitly use locality is at best seriously incomplete. Decoherence is too complicated to explain here, but a key assumption is that the interactions are local in space, which picks out the position basis as a naturally favored pointer basis, so that position near-eigenstates are more natural than, say, momentum near-eigenstates.
2. and 3.
The locality of the system, and specifically the assumption that perturbations are all local, gives us a notion of the "distance" between two states which is more useful than mere orthogonality. As you point out, orthogonality/inner products alone can't distinguish between two states that only differ by a single spin, and two states that differ by all their spins, even though the latter pair is clearly in some sense "more different" than the former.
You are of course correct that $\langle i | A | j \rangle = 0$ for any two distinct eigenstates of any Hermitian operator, not just the Hamiltonian. But that simple matrix element isn't the right definition of "the tunneling amplitude". As far as I know, the actual definition is a little fuzzy and the concept is more of an art than a science, but here are two possible conceptualizations:
a) You can think of the symmetry-breaking term as a perturbation and decompose the Hamiltonian as $H = H_0 + \Delta H$, where $H_0$ respects the symmetry and $\Delta H$ breaks it. Then perturbation theory tells us that all the perturbative corrections can be expressed in terms of the matrix elements $\langle i_0 | \Delta H | j_0 \rangle$ where $\langle i_0|$ and $|j_0\rangle$ are the eigenstates of the unperturbed Hamiltonian $H_0$, not the exact Hamiltonian. These matrix elements are generically nonzero.
b) I don't like perturbation theory, so I prefer to think of it by analogy with Monte Carlo. The environment is constantly trying to act on the system with random little local symmetry-breaking perturbations. You can think of it as if $h = 0$ in the full Hamiltonian, but $h_i \sigma_i^x$ terms randomly appear momentarily at individual sites $i$ (or similar terms on small local clusters of sites). These are like Monte Carlo candidate spin flips, and at low temperature they usually only get accepted if they lower the system's total energy. For a small system that starts in the all-$\uparrow$ state, you might get lucky and accept enough flips to eventually take you into a majority-$\downarrow$ state, at which point you'll then probably proceed to all-$\downarrow$ - even though each of those first few individual flips was unlikely. But in order to flip more than half the system, you initially need to get lucky many (independent) times in a row, and the odds of that happening decrease exponentially with system size. The "tunneling amplitude" is basically the probability of this happening after many Monte Carlo sweeps, and it indeed decreases exponentially with system size. For a small system, you'll eventually flip over to the other ground state, although it may take a while. For a large system, it'll take a really long time, and for an infinite system it'll never fully get there.
If that analogy's too classical for your taste, you can instead think of the space of random quantum circuits, with circuits weighted according to a cost function that depends on the Hamiltonian matrix elements, and the "tunneling amplitude" between two quantum states is like the total weight of all the random circuits that take one state to the other.
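The Monte Carlo picture in (b) can be tried out numerically. The following is only a minimal sketch under assumptions that are not in the discussion above: a 2D nearest-neighbor Ising model on an $L \times L$ periodic lattice, single-spin Metropolis updates, and units with $J = k_B = 1$. The function names `flip_cost` and `sweeps_until_reversal` are made up for this illustration.

```python
import math
import random


def flip_cost(spins, L, x, y):
    """Energy change (in units of J) from flipping the spin at (x, y).

    Assumes H = -J * sum over nearest-neighbor pairs of s_i * s_j
    on an L x L lattice with periodic boundaries.
    """
    s = spins[x][y]
    neighbors = (spins[(x + 1) % L][y] + spins[(x - 1) % L][y]
                 + spins[x][(y + 1) % L] + spins[x][(y - 1) % L])
    return 2 * s * neighbors


def sweeps_until_reversal(L, T, max_sweeps, seed=0):
    """Start from all-up and run Metropolis sweeps at temperature T.

    Returns the first sweep after which the total magnetization is
    negative, or None if that never happens within max_sweeps.
    """
    rng = random.Random(seed)
    spins = [[1] * L for _ in range(L)]
    for sweep in range(1, max_sweeps + 1):
        for _ in range(L * L):  # one sweep = L*L candidate flips
            x, y = rng.randrange(L), rng.randrange(L)
            dE = flip_cost(spins, L, x, y)
            # Always accept energy-lowering flips; otherwise accept
            # with the Boltzmann probability exp(-dE / T).
            if dE <= 0 or rng.random() < math.exp(-dE / T):
                spins[x][y] = -spins[x][y]
        if sum(map(sum, spins)) < 0:
            return sweep
    return None


if __name__ == "__main__":
    # T = 1.5 is below the 2D Ising critical temperature (about 2.27).
    for L in (3, 4, 5):
        print(L, sweeps_until_reversal(L, T=1.5, max_sweeps=20000))
```

Running this for a few values of `L` at fixed `T` gives a rough feel for how the waiting time for a full reversal grows with system size. A single seed is only one sample, so averaging over many seeds would be needed before reading off any scaling.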
4.
You're right that any nonzero value of $h$ breaks the symmetry. For any system, even an infinite one, you have that $m(h) \neq 0$ if $h > 0$. But what about the limit $h \to 0^+$? One definition of SSB is the failure of the limits $h \to 0^+$ and $N \to \infty$ to commute. In the SSB phase, after you take $N \to \infty$, $m(h)$ has a jump discontinuity at $h = 0$, so that $m(0) = 0$ but $\lim \limits_{h \to 0^+} m(h) > 0$. That's what we mean when we say that an infinitesimal perturbation $h$ breaks the symmetry.
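A toy formula makes the non-commuting limits concrete. Assume, purely for illustration, that at low temperature only two macrostates matter, with magnetization per site $\pm m_0$ and field energies $\mp N m_0 h$. The symbols $m_0$ and $\beta$ (inverse temperature) are introduced here, and this two-state caricature is not a derivation from the Ising model itself. Boltzmann-weighting the two states gives

$$m(h, N) = m_0 \tanh(\beta N m_0 h)$$

so that

$$\lim_{h \to 0^+} \lim_{N \to \infty} m(h, N) = m_0 > 0, \qquad \lim_{N \to \infty} \lim_{h \to 0^+} m(h, N) = 0 .$$

For any finite $N$ the function is smooth through $h = 0$; the jump only appears once $N \to \infty$ is taken first.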
As correctly stated in the following answer by flippiefanus, dynamical symmetry breaking is identical to spontaneous symmetry breaking, except that in dynamical symmetry breaking a composite noninvariant field operator acquires a vacuum expectation value, while in spontaneous symmetry breaking an elementary noninvariant field operator acquires a vacuum expectation value. Please see, for example, the following review by Higashijima (at the bottom of page 2).
Apart from this difference, these two cases are completely identical: In both cases, the Goldstone theorem applies; the rules for the number of Nambu-Goldstone bosons and their representations are the same.
Both cases above refer to global symmetry breaking.
The Higgs mechanism differs from both cases. First, although many textbooks introduce the Higgs mechanism in classical theory as spontaneous symmetry breaking (of the global symmetry) in systems with local symmetry, this is not the only valid description. Landsman describes the two approaches in the case of the Abelian Higgs model:
$$\mathcal{L} = -\frac{1}{4} F_A^2 + \frac{1}{2} D_{\mu}^A\phi D^{\mu}_{A}\phi - V(|\phi|)$$
Perform the following redefinition of the fields:
$$\begin{pmatrix}\phi_1 \\ \phi_2\end{pmatrix} = e^{i \theta \sigma_x}\begin{pmatrix}\rho \\0\end{pmatrix}$$
$$A_{\mu} = B_{\mu} + \partial_{\mu} \theta$$
By substituting this parametrization into the Lagrangian, the $\theta$ dependence vanishes completely, and we are left with:
$$\mathcal{L} = -\frac{1}{4} F_B^2 + \frac{1}{2} \partial_{\mu}\rho \partial^{\mu}\rho +\frac{1}{2}\rho^2 B_{\mu}B^{\mu} - V(\rho)$$
This Lagrangian (which is gauge fixed as both $\rho$ and $B$ are invariant under the gauge transformation) describes a real scalar field and a massive gauge boson in the case when the scalar field acquires a vacuum expectation value.
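As a check that $\theta$ really drops out, here is a short sketch in complex notation. It assumes $\phi = \phi_1 + i\phi_2 = \rho e^{i\theta}$, unit charge, and the sign convention $D_\mu^A = \partial_\mu - iA_\mu$. These conventions are assumed here for illustration, and Landsman's may differ by signs or factors.

$$D_\mu^A \phi = (\partial_\mu - iB_\mu - i\partial_\mu\theta)\,(\rho e^{i\theta}) = e^{i\theta}\,(\partial_\mu\rho - iB_\mu\rho)$$

$$\frac{1}{2}\,\big|D_\mu^A\phi\big|^2 = \frac{1}{2}\,\partial_\mu\rho\,\partial^\mu\rho + \frac{1}{2}\,\rho^2 B_\mu B^\mu$$

$$F^A_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = F^B_{\mu\nu} + (\partial_\mu\partial_\nu - \partial_\nu\partial_\mu)\theta = F^B_{\mu\nu}$$

The $i\rho\,\partial_\mu\theta$ term produced by differentiating the phase cancels against the $\partial_\mu\theta$ piece of $A_\mu$, the overall phase $e^{i\theta}$ disappears in the modulus, and $V$ depends only on $|\phi| = \rho$.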
Landsman also describes the conventional picture where the Nambu-Goldstone boson gets eaten by the gauge field. The question of which picture is the right one in quantum theory is not settled. The difference is that in the conventional picture, the global rigid symmetry gets spontaneously broken, while in the second picture it does not.
The conventional picture seemingly contradicts Elitzur's theorem, according to which a local gauge symmetry cannot be spontaneously broken. For this reason, some authors prefer the second picture over the conventional one (please see the following lecture notes). However, as Landsman shows on pages 426-428, it is still possible to implement the first picture on a gauge-fixed Lagrangian, for which Elitzur's theorem is not valid. The only loophole remaining in the conventional picture is that gauge fixing does not get rid of all gauge redundancy.
Consider Kitaev's toric code Hamiltonian (wiki, some helpful notes) on a 2-torus. This Hamiltonian has symmetry operators that live on non-trivial loops around the torus. Its ground space is 4-fold degenerate, and one can show that the symmetry operators have a nontrivial representation in the ground space, satisfying your definition for spontaneous symmetry breaking (SSB). But conventionally, this system is not said to exhibit SSB; it is instead an example of topological order.
SSB and topological order are similar ideas, as your definition suggests, but the crucial difference is that states in the ground space of topologically ordered systems are locally indistinguishable (every local operator has the same expectation value in all of them), whereas this is not true for SSB systems.
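The 4-fold degeneracy can be checked by a standard counting argument, sketched here for an $L \times L$ square lattice on the torus with one qubit per edge (the symbol $L$ is introduced for this illustration). There are $2L^2$ qubits, $L^2$ vertex operators and $L^2$ plaquette operators. The product of all vertex operators is the identity, and likewise for the plaquettes, so only $2L^2 - 2$ of these commuting constraints are independent:

$$\dim(\text{ground space}) = 2^{\,2L^2 - (2L^2 - 2)} = 2^2 = 4$$

The result is independent of $L$, which is consistent with the degeneracy being tied to the topology of the torus and not to the system size.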
A modern interpretation of topological order is that it can be thought of as SSB of higher-form symmetries (1-form in the case of toric code), and conventional SSB can be thought of as SSB of 0-form symmetries. In this generalized symmetry paradigm, your definition for SSB would be correct. In the toric code example, $G$ would be the set of loop-like symmetries which get spontaneously broken in the topologically ordered phase. Check out this review article for more on that.