No, that's not true at all.
Let's start with the most basic of meta-theories: $\sf PRA$, the theory of Primitive Recursive Arithmetic. Here we have the natural numbers, and the primitive recursive functions.
This theory is strong enough to internalise FOL, so we can just assume that we're manipulating strings. The point of an axiom schema is that it is a predicate that lets us recognise all the axioms which have a certain form. And any reasonable schema, including those of $\sf ZFC$, are in fact primitive recursive. In other words, there is a primitive recursive function $f_{\rm Sep}(n)$ which takes in $n$, and if $n$ is the Gödel number of a formula with some basic properties $\varphi$, then $f_{\rm Sep}(n)$ is the Gödel number of the axiom obtained from the schema by putting $\varphi$ into it. If $n$ is not the Gödel number of a suitable formula, just return the axiom for the formula $\varphi$ given by $x=x$ or something like that.
Since $\sf ZFC$ is presented as finitely many axioms and one or two schemata (Separation and Replacement, but Replacement is generally enough to prove Separation, making it redundant), then the collection of Gödel numbers of axioms of $\sf ZFC$ is in fact primitive recursive. So we can really talk about the first-order theory that is $\sf ZFC$.
To recap, the point of the schemata is to allow us to have infinitely many axioms with the same structure—that we can recognise mechanically—so that when we need to use any such axiom in a proof, we can always be sure that it is part of this or not.
Another way to approach this is by saying that our foundation is in fact $\sf ZFC$. We use set theory to discuss set theory. This sounds circular, but how is this any different than using $\sf PRA$ to study the logical consequences of $\sf PRA$? It's not. Mathematics is not something that we do in vacuum, some assumptions are needed. And it is perfectly fine to study $\sf ZFC$ inside $\sf ZFC$.
There, we have the notions of sets already existing, so we can talk about a set of axioms. Of course, we need to argue why a certain set exists, which is to say that we need to be able to prove that the set of these axioms actually exists. And again we use the fact that a schema is a function which takes formulas and returns axioms, so we may replace the schema with the axioms, as it is the range of this function.
So again, we have that $\sf ZFC$ is a set of first-order axioms in the language of set theory.
History, unfortunately, is more complicated than your choices permit.
Presumably, Zermelo had proposed his 1908 axioms in support of his well ordering theorem. In so far as his axioms might relate to intuitive notions of set, his original axioms admitted urelements and relied on singletons to substantiate the truth of equality statements. The idea that a theory of pure sets suffice for mathematics is attributed to Skolem as far as I know. With a little further consideration, this may be seen as assuming the arithmetization of mathematics. As such, one must recognize the chicken and egg problem that arises if linguistic symbols that are not pure sets become conflated with some Goedellian encoding for the sake of claiming that everything is a set.
Zermelo's reliance on singletons reflects the relationship of his system with Frege's logic. Category theorists claim that Zermelo had to make s choice between differing views expressed by Frege and Cantor. Since the relevant text is not translated into English, I cannot judge this claim. Nevertheless, Frege speaks of object identity and concept identity in a circular fashion that arises from Leibniz' principle of the identity of indiscernibles. Zermelo's use of singletons ought to be understood in this context. Foundationalism in the sense of Russell's logical atomism stipulates that object identity is fundamental. Hence, one has "first-order logic" with the proviso that "identity" is a "second-order" concept.
Since modern accounts of Zermelo-Fraenkel set theory are now based upon the first-order paradigm and the presumption that a theory of pure sets suffice for mathematics, urelements are treated in a derivative theory often called ZFA where the 'A' stands for (logical) atoms. The theory assumes that the urelements form a set (in conformance with the "everything is a set" presupposition). And, it is an important theory because of the models that arise from permuting the urelements.
It is, however, entirely plausible to understand urelements with respect to an antichain relation in a different signature.
Among other things to be found in Russell's writings, is the question of how a set is comparable to an object. While no one would ask that particular question today, the fact that urelements are not sets means that an antichain whose denial yields an order based on membership is entirely conceivable.
Such a theory would permit one to assert how non-sets satisfy the real number axioms, for example.
Understanding the distinction between urelements and sets in this manner permits one to understand how Cohen's forcing can extend a given model with "ideal subsets." The axiom of extension is contingent upon the denial of an antichain.
But, since such a view is not of interest to established mathematicians, students of set theory need to focus on the affirmative claims of first-order logic and set theory without concern for "intuition."
With regard to your second question, let me suggest that you look at the papers by Augustus de Morgan found in Ewald's two-volume anthology, "From Kant to Hilbert. " This will provide you with a good foundation for understanding what the assertion, "mathematics is formal," really means.
The received view corresponding with first-order model theory reduces mathematics to language signatures, stipulation of relations and operations involving the symbols of such signatures, and a specification of the arities for each symbol in the signature. These constructs are often called "similarity types."
Of course, the "first-order" designation goes back to the foundationalist stipulation that object identity is fundamental. So, language signatures will contain a symbol denoting an extensional domain of discourse. Extension is important to mathematical practice so that linguistic expressions intended to have "the same meaning" denote "singularly."
It had been Skolem's criticism of Zermelo's theory that gave impetus to this use of language signatures in mathematics over the notion that "mathematics is extensional." This particular claim arises from a different history. It is, of course, important for model theory. But, in the model theory of set theory it yields the "unfolding" and "geology" of possible models studied by set theorists using Cohen's forcing. What Skolem observed about Zermelo's axioms is that they could not be "categorical" in the sense of specifying a single, unique interpretation of the language symbols.
Best Answer
I think the narrative that sets are a primitive notion in ZFC is slightly incorrect. A more precise statement is that ZFC never discusses a notion of “set” at all. Instead, what is discussed is “set membership”; that is the primitive notion. Saying that ZFC defines “what a set is” or “how a set behaves” is, in my view, not correct.
At best, ZFC describes how the collection of all sets behaves.
An analogy would be the axioms of a group. These axioms define how the group itself behaves. They do not talk about the nature of group elements; they only discuss how group elements are related to each other by the group operation.
Similarly, ZFC describes how a universe of sets should behave. It does not discuss the nature of individual sets; it only discusses how sets are related to each other by the set membership predicate.