The categories of models with elementary embeddings are accessible categories. (The cardinal κ is related to the size of the language via Löwenheim-Skolem; the κ-presentable, aka κ-compact, objects are models of size less than κ.) Michael Makkai and Bob Paré originally described this idea in Accessible categories: the foundations of categorical model theory (Contemporary Mathematics 104, AMS, 1989). Still more can be found in later works such as Adámek and Rosický, Locally presentable and accessible categories (LMS Lecture Notes 189, CUP, 1994).
More generally, abstract elementary classes can also be viewed as accessible categories. Thus accessible categories include categories of models of infinitary theories, theories with generalized quantifiers, etc. In fact, accessible categories can always be attached to such structures, but I don't know the exact characterization of the categories that arise from models of theories of first-order logic. The Yoneda embedding can sometimes be used to attach first-order models to accessible categories, such as when the accessible category is strongly categorical (Rosický, Accessible categories, saturation and categoricity, JSL 62, 1997). On the other hand, you can reformulate a lot of model-theoretic concepts in general accessible categories. There are more than a few kinks along the way and not all of it has been done, but the more I learn, the more I find that this is actually a very interesting and powerful way to approach model theory.
Let me try to explain the situation in greater detail. I guess the correspondences are better explained in terms of sketches. (The nLab page on sketches needs expansion; Adámek and Rosický give a nice account of sketches; another account can be found in Barr and Wells.) A sketch asserts the existence of certain limits and colimits, or just limits in the case of a limit sketch. Taken together, these assertions can be formulated as a sentence in L∞,∞ (sketchy details below). Like such sentences, every sketch S has a category Mod(S) of models. Sketches and accessible categories go hand in hand.
- If S is a sketch, then Mod(S) is an accessible category, and every accessible category is equivalent to the category of models of a sketch.
- If S is a limit sketch, then Mod(S) is a locally presentable category, and every locally presentable category is equivalent to the category of models of a limit sketch.
When translated into L∞,∞, a limit sketch becomes a theory with axioms of the form
$\forall\bar{x}(\phi(\bar{x})\to\exists!\bar{y}\psi(\bar{x},\bar{y})),$
where $\phi$ and $\psi$ are conjunctions of atomic formulas (and the variable lists $\bar{x}$ and $\bar{y}$ can be infinite). When the category is locally finitely presentable, these axioms can be stated in Lω,ω. Theories with axioms of this type are essentially characterized by the fact that Mod(T) has finite limits.
- If T is a theory in Lω,ω and Mod(T) is closed under finite limits (computed in Mod(∅)), then Mod(T) is a locally finitely presentable category (and hence finitely accessible).
- Every locally finitely presentable category is equivalent to a category Mod(T) where T is a limit theory in Lω,ω (i.e. with axioms as described above).
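As a hedged illustration of the axiom shape above, here is one axiom from a possible presentation of small categories as a limit theory in Lω,ω. The symbols $\mathrm{dom}$ and $\mathrm{cod}$ (unary functions) and $\mathrm{comp}$ (a ternary relation for the graph of composition) are notational assumptions of this sketch, not taken from the sources cited above; the elements of a model are thought of as arrows, with objects identified with identity arrows.

```latex
% phi is the atomic formula cod(f) = dom(g); psi is the atomic formula comp(f,g,h).
\[
  \forall f\,\forall g\,\bigl(\mathrm{cod}(f)=\mathrm{dom}(g)
    \;\to\; \exists! h\;\mathrm{comp}(f,g,h)\bigr)
\]
% Universal equations fit the same pattern: phi is the empty conjunction
% and the list \bar{y} is empty, so "exists unique \bar{y}" disappears.
\[
  \forall f\,\bigl(\mathrm{dom}(\mathrm{dom}(f))=\mathrm{dom}(f)\bigr)
\]
```

In the first axiom $\bar{x}$ is $(f,g)$ and $\bar{y}$ is $(h)$: a composite exists, and is unique, whenever the two arrows are composable.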
It is natural to conjecture that this equivalence continues when ω is replaced by ∞. Adámek and Rosický have shown in A remark on accessible and axiomatizable theories (Comment. Math. Univ. Carolin. 37, 1996) that, for a complete category, being equivalent to a (complete) category of models of a sentence in L∞,∞ and being accessible are equivalent, provided that Vopěnka's Principle holds. In fact, this equivalence is itself equivalent to Vopěnka's Principle. (It is apparently unknown whether accessible can be strengthened to locally presentable.)
Now, if T is a sentence in L∞,∞, then the category Elem(T) (models of T under elementary embeddings) is always an accessible category. The category Mod(T) is unfortunately not necessarily accessible. When translated into L∞,∞, sketches become sentences of a special form. A formula in L∞,∞ is positive existential if it has the form
$\bigvee_{i \in I} \exists\bar{y}_i \phi_i(\bar{x},\bar{y}_i)$
where each $\phi_i$ is a conjunction of atomic formulas. A basic sentence in L∞,∞ is a conjunction of sentences of the form
$\forall\bar{x}(\phi(\bar{x})\to\psi(\bar{x}))$
where $\phi$ and $\psi$ are positive existential formulas.
- A category is accessible if and only if it is equivalent to a category Mod(T) where T is a basic sentence in L∞,∞.
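For a concrete basic sentence, consider the usual axioms for fields in the language of rings. This is an assumed illustration, not an example drawn from the sources above. The equational ring axioms are already of the required form, and the two non-equational axioms can be written as follows, using the conventions that an empty conjunction is true and an empty disjunction is false.

```latex
% Every element is zero or invertible: the hypothesis x = x is a trivial
% atomic formula, and the conclusion is a two-term positive existential formula.
\[
  \forall x\,\bigl(x = x \;\to\; (x = 0)\ \vee\ \exists y\,(x\cdot y = 1)\bigr)
\]
% Nontriviality: the conclusion is the empty disjunction, i.e. falsum.
\[
  0 = 1 \;\to\; \bigvee_{i\in\emptyset}\exists\bar{y}_i\,\phi_i
\]
```

Neither axiom has the $\exists!$ shape of a limit axiom, which is consistent with the standard observation that the category of fields is accessible but not locally presentable.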
It would be great if one could simply replace accessible by finitely accessible and sentence in L∞,∞ by theory in Lω,ω, as in the locally presentable case above. Unfortunately, this is simply not true. The category of models of the basic sentence $\forall x\exists y(x \mathrel{E} y)$ in the language of graphs is accessible but not finitely accessible. A counterexample in the other direction is the category of models of $\bigvee_{n<\omega} f^{n+1}(a) = f^n(a)$, which is finitely accessible but not axiomatizable in Lω,ω.
You are comparing apples and oranges. Model theory should be compared with categorical logic, not category theory. Conversely, category theory should be compared with algebra, not model theory.
Model theory is the study of set-theoretic models of theories expressed in first-order classical logic. As such it is a particular branch of categorical logic, which is the study of models of theories, without insistence on set theory, first order, or classical reasoning.
Best Answer
I take your question to be about what we might call the structuralist perspective, the view that we specify mathematical objects and structures by their defining structural features, ignoring any internal or otherwise irrelevant structure that an instantiation of the object might exhibit. You perceive a tension between this view and the pure theory of sets, in which every set carries its hereditary $\in$-structure. You propose that the concept of urelements---objects that are not sets but which can be elements of sets---provides exactly what is needed to implement the structuralist perspective: because urelements have no internal set-theoretic structure, there would seem to be nothing to ignore. So the plan appears to be for us to present the natural numbers as given canonically by urelements and thereby hope to finesse any need to engage the structuralist perspective directly.
But this strategy doesn't actually succeed, does it, since someone might permute the urelements---swap two of them, say---and thereby build a perfectly good copy of the natural numbers, still made from urelements. If the urelements were supposed to provide for you a canonical concept of the natural numbers, then you would have a canonical number $5$, but which urelement will you say is the real number $5$? Similarly, as you mention, we might swap the "dots" in your question. So even when we build our structures from urelements, the structuralist issue still arises. But the point of having them, if I understand you correctly, was to avoid that issue.
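To spell out the swap in symbols, here is a small worked version. The names $U$, $u_n$, $S$, $S'$ and $\pi$ are assumptions of this sketch, with $u_5$ playing the role of the supposedly canonical number $5$.

```latex
% Urelement copy of the natural numbers: U = {u_0, u_1, u_2, ...},
% with zero u_0 and successor S(u_n) = u_{n+1}.
% Let pi be the permutation of U that swaps u_5 and u_6 and fixes everything else.
\[
  S' = \pi\circ S\circ\pi^{-1},
  \qquad\text{so}\qquad
  S'(\pi(x)) = \pi(S(x)) \ \text{ for all } x\in U .
\]
% Hence pi is an isomorphism from (U, S, u_0) to (U, S', u_0), and
\[
  (S')^{5}(u_0) = \pi\bigl(S^{5}(\pi^{-1}(u_0))\bigr) = \pi(u_5) = u_6 .
\]
```

Both structures are built from the same urelements and are isomorphic, yet one says the number $5$ is $u_5$ and the other says it is $u_6$.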
Secondly, urelements are often described as distinct but indistinguishable, each having all the same properties as the others. But this is problematic, since an urelement $x$ is the only urelement that has the property of being $x$, as well as the only element of $\{x\}$ and so on. Perhaps that urelement is also my favorite urelement! Or perhaps it was created first among all the urelements, whatever that might mean, or perhaps it even does have a secret internal, irrelevant mathematical but not set-theoretic structure that is hidden from our knowledge and which remains inaccessible to us. You might reply that all these are features of urelements that you want to ignore---they are irrelevant---but this would simply be admitting that you haven't avoided the structuralist issue with urelements.
I take these issues to show that urelements don't actually help us avoid the need to engage with the structuralist perspective directly. We want to adopt the structuralist view, and to specify our mathematical objects by their defining structural features rather than by the essential nature of their constituent objects.
The urelement concept arises naturally from two views in naive set theory, first, the view that one must have some objects before it is sensible to speak of sets of objects, and second, the view that set theory is essentially a supplemental theory, built on top of other mathematical theories, providing assistance in theoretic argument. One first has the natural numbers, for example, whatever they are, and then one may consider sets of natural numbers and sets of these sets and so on, and the same for real numbers, and these sets assist with the original mathematical analysis.
Set theorists quickly realized, however, that the structuralist perspective allowed them to abandon any need for the urelements---all the favorite mathematical structures can be constructed out of pure sets. Set theory proceeds in a pure, elegant development without urelements, and set theorists adopt the structuralist perspective wholesale. (What is a set, really? I don't care---but I care about the structure of its $\in$-relations to the other sets.) Even the urelements themselves can be simulated by finding structural copies of them within the pure set theory, just as we construct the integers and the real field.
In this way, both of the naive views mentioned two paragraphs back are overturned: the cumulative hierarchy of sets arises from nothing, towering higher than we can imagine, while providing the desired instances of all of our favored mathematical structures. This is the sense in which set theory unifies mathematics, by providing a common forum in which we can view all other mathematical arguments as taking place.
Lastly, let me mention that the idea of permuting urelements gave rise to the earliest consistency proofs of $\neg AC$. One begins with a model of ZFA, and then fixes a group of permutations of the urelements, restricting to the universe of sets that hereditarily respect that group action. It can be arranged that the resulting symmetric model satisfies $ZFA+\neg AC$, and so we arrive at models without the axiom of choice. It was not known how to do this in a pure set theory until Cohen introduced the forcing technique. Nevertheless, the Jech-Sochor embedding theorem shows that every initial segment of a permutation model of ZFA has a copy as a permutation model of ZF, in the pure theory, in which the iterated power set structure of the atoms is respected up to that bound. This theorem therefore simultaneously redeems the early approach to $\neg AC$ using urelements, while also showing that the method was not necessary for that application.
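In the usual textbook presentation, "hereditarily respecting the group action" can be made precise roughly as follows. The notation is assumed for this sketch and is not fixed by the text above: $A$ is the set of urelements, $G$ the chosen group of permutations of $A$, $\mathcal{F}$ a normal filter of subgroups of $G$, and $N$ the resulting model.

```latex
% Requires amsmath for \text.
% Extend each permutation pi in G from the urelements to all sets by recursion:
\[
  \pi x = \{\pi y : y \in x\}.
\]
% The stabilizer of x, and the symmetry condition relative to the filter:
\[
  \mathrm{sym}(x) = \{\pi \in G : \pi x = x\},
  \qquad
  x \text{ is symmetric} \iff \mathrm{sym}(x)\in\mathcal{F}.
\]
% The permutation model N consists of the hereditarily symmetric objects:
\[
  x \in N \iff \text{every element of the transitive closure of } \{x\} \text{ is symmetric}.
\]
```

A choice function or well-ordering of the urelements is typically moved by too many permutations to be symmetric, which is how such models can fail the axiom of choice.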
Apologies for this long answer...