The papers you should read for this question are Vaidman, Lev. "Torque and force on a magnetic dipole." Am. J. Phys. 58.10 (1990): 978-983 (paywall-free version) and Haus, H. A., and P. Penfield. "Force on a current loop." Physics Letters A 26.9 (1968): 412-413. (References 8 and 9 in Vaidman's paper are also worth reading for more context.) The gist of the answer is this: magnetic dipoles and current loops are equivalent if and only if there are no magnetic monopoles.
As you have identified in your point (1), there is indeed a problem with the term magnetic dipole. Let's unpack the term starting with dipole.
The simplest system in electromagnetism is the electric point charge at rest, which produces a spherically symmetric $1/r$ potential (it's easier to work with the potential here). This is called a monopole field, and a point charge is a monopole.
Now consider two equal but opposite charges a distance $a$ from each other. This breaks rotational symmetry so the field will not be spherically symmetric, i.e., it is angle-dependent. Now expand the potential at a distance $r \gg a$ from the charges in powers of $1/r$. Since electromagnetism is linear, the $1/r$ terms are equal but opposite and cancel, but there remains a term of order $1/r^2$. Because the potential is a scalar, it must have the form $\mathbf p \cdot \mathbf r / r^3$ where $\mathbf p$ is a vector called the dipole moment. You can continue the expansion; the next term is $Q_{ij} x_i x_j / r^5$ where the tensor $Q_{ij}$ is the quadrupole moment, and so on. (The names reflect the minimum number of point charges you need to have a nonzero moment of that order: one for a monopole, two for a dipole, four for a quadrupole... All the details of this multipole expansion and what it's good for are in Jackson, of course.)
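As a quick numerical check of this expansion, here is a minimal sketch that compares the exact potential of the two charges with the leading dipole term $\mathbf p \cdot \mathbf r / r^3$. The function names, the choice of Gaussian units, and the values of `q`, `a` and `theta` are illustrative assumptions, not part of the original argument.

```python
import math

def exact_potential(q, a, r, theta):
    """Exact potential of +q at z=+a/2 and -q at z=-a/2 (Gaussian units assumed)."""
    x, z = r * math.sin(theta), r * math.cos(theta)
    r_plus = math.hypot(x, z - a / 2)   # distance to the positive charge
    r_minus = math.hypot(x, z + a / 2)  # distance to the negative charge
    return q / r_plus - q / r_minus

def dipole_potential(q, a, r, theta):
    """Leading term p.r / r^3 = p cos(theta) / r^2, with p = q*a along z."""
    return q * a * math.cos(theta) / r**2

def relative_error(q, a, r, theta):
    """Relative deviation of the dipole term from the exact potential."""
    exact = exact_potential(q, a, r, theta)
    return abs(dipole_potential(q, a, r, theta) - exact) / abs(exact)

# Illustrative values: unit charges, unit separation, observation angle 60 degrees.
q, a, theta = 1.0, 1.0, math.pi / 3
errors = {r: relative_error(q, a, r, theta) for r in (2.0, 10.0, 100.0)}
for r, err in errors.items():
    print(f"r/a = {r:6.1f}   relative error = {err:.2e}")
```

The deviation from the pure dipole form shrinks quickly as $r/a$ grows, which is the sense in which the dipole term dominates at $r \gg a$.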
So by a dipole field we mean a field whose potential looks like $\mathbf p \cdot \mathbf r /r^3$, and its source is an electric dipole. Of course the magnetic field has a vector potential rather than a scalar potential, so by a magnetic dipole field we should mean one where the vector potential is like $\mathbf A \sim \mathbf m \times \mathbf r / r^3$, and its source is a magnetic dipole.
Now, one way to construct a magnetic dipole is by the obvious analogy: take two magnetic charges, i.e., two opposite magnetic monopoles a small distance apart. Well, that's easier said than done because no one has found any magnetic monopoles. We'll have to go with currents, then. Since the $1/r$ expansion is possible only if the size of the system is finite and charge is conserved, we'll have to use current loops, and conversely, any current loop will be a magnetic dipole.
In this sense, and this sense only, magnetic dipoles and current loops are equivalent. If you were to find some magnetic monopoles and arrange them such that the dipole moment is $\mathbf m$, then the force on the system would be $$\mathbf F_\text{MM} = (\mathbf m \cdot \nabla) \mathbf B - \frac{1}{c}\dot{\mathbf m} \times \mathbf E$$
where $\dot{\mathbf m}$ is the time derivative of $\mathbf m$,
whereas the force on a current loop is $$\mathbf F_\text{CL} = \nabla (\mathbf m \cdot \mathbf B) - \frac{1}{c}\frac{d}{dt} ( \mathbf m \times \mathbf E ).$$
If you expand $\mathbf F_\text{CL}$ and use Ampère's law with Maxwell's displacement current, you see that these forces differ by $k\,\mathbf m \times \mathbf J$, where $\mathbf J$ is the current density at the position of the dipole and $k$ is a constant that depends on your unit system. Clearly these magnetic dipoles are not equivalent.
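To spell out that step, here is a sketch of the algebra. It assumes Gaussian units (suggested by the factors of $1/c$ above) and a dipole at rest, so that the total time derivative of $\mathbf E$ at the dipole's position equals $\partial \mathbf E / \partial t$. Since $\mathbf m$ does not depend on position, $$\nabla(\mathbf m \cdot \mathbf B) = (\mathbf m \cdot \nabla)\mathbf B + \mathbf m \times (\nabla \times \mathbf B), \qquad \frac{d}{dt}(\mathbf m \times \mathbf E) = \dot{\mathbf m} \times \mathbf E + \mathbf m \times \frac{\partial \mathbf E}{\partial t},$$ so that $$\mathbf F_\text{CL} - \mathbf F_\text{MM} = \mathbf m \times \left( \nabla \times \mathbf B - \frac{1}{c}\frac{\partial \mathbf E}{\partial t} \right) = \frac{4\pi}{c}\, \mathbf m \times \mathbf J,$$ where the last equality uses $\nabla \times \mathbf B = \frac{4\pi}{c}\mathbf J + \frac{1}{c}\frac{\partial \mathbf E}{\partial t}$. Under these assumptions the constant is $4\pi/c$, and the two forces agree wherever no current flows through the dipole.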
(However, several authors erroneously calculate the force using the magnetic charge model and think it must be true for current loops, which, as shown by Vaidman, is not the case.)
There is one objection, and that is that we know about spin, the intrinsic magnetic moment of particles such as electrons. I don't think it is obvious whether spin should be treated as a magnetic charge dipole or as a current loop. For an elementary particle to be a current loop certainly seems strange, but it's not really less strange to think about it as a system of magnetic monopoles. One would think it's an experimental question, then, but Bohr and Pauli argued in the 1920s that the spin of an individual electron is rather inaccessible to experiment (see Morrison, Margaret. "Spin: All is not what it seems." Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 38.3 (2007): 529-557 for an account). In any experiment with electrons, the Lorentz force would anyway dominate the magnetic dipole force, so one would have to turn to neutrons, which come with other difficulties. Vaidman discusses this briefly.
However, theoretically, if the situation is analyzed correctly, that is, using the Foldy-Wouthuysen transformation, it is found that the current-loop model is correct. (The original paper is Foldy, Leslie L., and Siegfried A. Wouthuysen. "On the Dirac theory of spin 1/2 particles and its non-relativistic limit." Physical Review 78.1 (1950): 29, which is a real gem and should be read by everyone studying quantum mechanics.) You can reconcile the apparent contradiction between a current loop and an elementary particle by realizing that this is a quantum mechanics thing, and in quantum mechanics you don't have to have point particles. In fact, you can't, by Heisenberg's uncertainty principle. The electron is always a bit spread out, and in such a way as to produce a current-loop magnetic dipole moment.
Best Answer
For the physical interpretation of the limit: imagine you had a current loop whose size you could decrease easily, like pulling on a drawstring. If you just make it smaller, you decrease the magnetic moment $\mu = IA$; to keep the field the same, you'd have to increase the current. If you cut the area $A$ of your loop in half, but doubled the current $I$, you'd have the same magnetic field far from the loop.
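The drawstring picture can be checked with the standard on-axis field of a circular loop, $B(z) = \mu_0 I R^2 / \bigl(2 (R^2 + z^2)^{3/2}\bigr)$. The sketch below assumes SI units and illustrative values for the current, radius and distances; none of these numbers come from the question.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T*m/A (SI units, an assumption here)

def loop_axis_field(current, radius, z):
    """On-axis field of a circular current loop at distance z from its center."""
    return MU0 * current * radius**2 / (2 * (radius**2 + z**2) ** 1.5)

# Original loop (illustrative values).
I1, R1 = 1.0, 0.10
A1 = math.pi * R1**2

# Shrunken loop: half the area, twice the current, so mu = I*A is unchanged.
A2 = A1 / 2
R2 = math.sqrt(A2 / math.pi)
I2 = 2 * I1

m1, m2 = I1 * A1, I2 * A2

z_far, z_near = 10.0, 0.05  # far: z >> R; near: z comparable to R
far_ratio = loop_axis_field(I1, R1, z_far) / loop_axis_field(I2, R2, z_far)
near_ratio = loop_axis_field(I1, R1, z_near) / loop_axis_field(I2, R2, z_near)

print(f"moments: {m1:.6f} vs {m2:.6f}")
print(f"far-field ratio  B1/B2 = {far_ratio:.6f}")
print(f"near-field ratio B1/B2 = {near_ratio:.3f}")
```

Far from the loops the two fields agree closely because only $\mu = IA$ matters there, while close to the loops the fields differ noticeably because the loop size itself enters.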
The ideal dipole source has zero size and infinite current, and the ideal dipole field is therefore infinitely strong at the origin. That's annoying. But we have the same problem with the monopole field, like the electric field from a point charge, which is proportional to $1/r^2$ and is infinite at the origin. For the electric field we get around this by inventing quantum mechanics and discovering that a "point charge" is not actually a thing that exists. The proton has a finite size; while the electron is a structureless "point particle," for computing its electric field you actually care about the finite charge density described by its wavefunction.
I might rephrase your second question as "what do we mean by 'large distance'?" Suppose you have a real cylindrical solenoid, made out of wires, with length $L$ and radius $R$. The dipole approximation is only good if your distance $r$ from the solenoid is much larger than both $L$ and $R$. If you're inside the core of the solenoid you see a uniform field; if you are a tiny gnat tunneling through the wall you might prefer to treat the local field as due to a locally-flat sheet of current. The local field is complicated.
For a complicated field, it's helpful to describe it using a multipole expansion. I've already hinted at this by reminding you about the monopole field, which is produced by a point charge, and gets weaker like $1/r^2$ as you move away. If two opposite-sign charges are near each other, their monopole fields approximately cancel, and most of what's left over is described by the dipole field. The dipole field gets weaker with distance like $1/r^3$; that faster falloff is sort of what we mean when we say that the monopole fields approximately cancel out. The dipole field also has a more complicated shape, which leaks information about the orientation of the charges at the source.
Two back-to-back dipoles also approximately cancel out. What's left there is called a "quadrupole field," which gets weaker like $1/r^4$ and has an even more complicated shape than the dipole field. There's an infinite series of these higher-order corrections, which get weaker more rapidly as you move away.
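These falloff rates can be checked numerically with point charges placed on a line. The sketch below estimates the exponent $n$ in $|E| \sim 1/r^n$ from the ratio of the on-axis field at $r$ and $2r$. The helper names, Gaussian units, and charge arrangements are illustrative assumptions.

```python
import math

def axial_field(charges, z):
    """E_z on the z axis from point charges [(q, z_q), ...] on that axis (Gaussian units assumed)."""
    total = 0.0
    for q, zq in charges:
        d = z - zq
        total += q * d / abs(d) ** 3  # Coulomb field along z, signed by direction
    return total

def falloff_exponent(charges, z):
    """Estimate n in |E| ~ 1/r^n by comparing the field at z and at 2z."""
    return math.log2(abs(axial_field(charges, z) / axial_field(charges, 2 * z)))

a = 1.0  # charge spacing (illustrative)
monopole = [(1.0, 0.0)]
dipole = [(1.0, a / 2), (-1.0, -a / 2)]
quadrupole = [(1.0, a), (-2.0, 0.0), (1.0, -a)]  # two back-to-back dipoles

z = 100.0 * a  # far away compared with the spacing
exponents = {
    "monopole": falloff_exponent(monopole, z),
    "dipole": falloff_exponent(dipole, z),
    "quadrupole": falloff_exponent(quadrupole, z),
}
for name, n in exponents.items():
    print(f"{name:10s} field falls off like 1/r^{n:.3f}")
```

At distances much larger than the spacing, the estimated exponents come out close to 2, 3 and 4, matching the monopole, dipole and quadrupole falloff described above.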
A current loop with radius $R$ produces a magnetic field with nonzero dipole moment, but also nonzero quadrupole, octupole, hexadecapole, etc. moments, all of which are parameterized by $R$. If you move from $r$ to $2r$, the dipole field gets weaker by a factor of $2^3=8$, but the quadrupole field gets weaker by $2^4=16$. If you move many $R$ away (or equivalently, rebuild your current loop so that $R$ is very small), eventually only the dipole shape of the field will be measurable.