The answer is no. Let ZC' be ZFC without replacement and infinity and with the assertion there is a Kuratowski infinite set. We will construct a model $M$ of ZC' such that only hereditarily finite elements of $M$ are fixed under all automorphisms of $M.$ The idea is generate a model from a $\mathbb{Z}^2$-array of objects, each of whose only element is the object below it in the array.
Define $M=\bigcup_{m<\omega} \bigcup_{n \in \mathbb{Z}} \mathcal{P}^m(\{\omega\} \times \mathbb{Z} \times [n, \infty)),$ where we adjust the $\mathcal{P}$ operator to replace each singleton of the form $\{(\omega, n_1, n_2)\}$ with $(\omega, n_1, n_2+1).$ (We use $\omega$ to differentiate the $\mathbb{Z}^2$ objects from the hereditarily finite sets).
Define a relation $E$ on $M$ by $E \restriction \mathbb{Z}^2=\{((\omega, n_1, n_2),(\omega, n_1, n_2+1)): n_1, n_2 \in \mathbb{Z}\}.$ Extending $E$ to the iterated power sets is done in the natural way. It is easy to see $(M, E)$ satisfies ZC'.
Notice that the map $(\omega, n_1, n_2) \mapsto (\omega, n_1, n_2+1)$ extends to a unique automorphism on $M,$ which only fixes hereditarily finite elements. Thus, none of the infinite sets in the model are first-order definable.
Update: The author of this question was curious whether there is such a model which also satisfies transitive containment (TC), the assertion that every set has a transitive closure. The answer is yes; here's a sketch of such a model.
Assume ZFC + "there are non-OD reals" in $V$ (or work in a ctm satisfying a sufficiently large finite fragment of this theory). We'll build an OD model $M$ of ZC'+TC such that, from every infinite element $x,$ we can define (in $V$) a non-OD real $r.$ Then $x$ is not definable in $M,$ or else we would have an ordinal definition for $r.$
For $r \in \mathbb{R} = \mathcal{P}(\omega),$ we recursively define $n_r$ by $0_r=\emptyset$ and $(n+1)_r =\{n_r\}$ if $n \not \in r,$ and $(n+1)_r = \{n_r, \emptyset\}$ if $n \in r.$ Basically, we're defining a new system of natural numbers for each real.
Let $M = \bigcup_{m=0}^{\infty} \bigcup_{S \in [\mathbb{R} \setminus OD]^{<\omega}} \mathcal{P}^m ( \{n_r: n<\omega, r \in S\} ).$
For each infinite $x,$ there are finitely many (and at least one) non-OD reals $r$ such that the transitive closure of $x$ contains every $n_r.$ The least such real is as desired. And it's routine to verify $M \models ZC'+TC.$
Yes, this is a model of ETCSR. Unfortunately, I don't know of a proof of this in the literature, which is in general sadly lacking as regards replacement/collection axioms in topos theory. But here's a sketch.
As Zhen says, the filterquotient construction preserves finitary properties such as Booleanness and the axiom of choice. Moreover, a maximal filterquotient will be two-valued. But as you point out, a nondegenerate two-valued topos satisfying the (external) axiom of choice is necessarily well-pointed; I wrote out an abstract proof at https://ncatlab.org/nlab/show/well-pointed+topos#boolean_properties. Thus, $\varinjlim_{U\ni s} \mathrm{Sh}_{\neg\neg}(S)$ is a model of ETCS.
As for replacement, the proof that I know (which is not written out in the literature) goes by way of the notions of "stack semantics" and "autology" in my preprint Stack semantics and the comparison of material and structural set theories (the other half, not the part that became the paper of mine cited in the question).
Briefly, stack semantics is an extension of the internal logic of a topos to a logic containing unbounded quantifiers of the form "for all objects" or "there exists an object". (My current perspective, sketched in these slides, is that this is a fragment of the internal dependent type theory of a 2-topos of stacks -- hence the name!) This language allows us to ask whether a topos is "internally" a model of structural set theories such as ETCS or ETCSR.
It turns out that every topos is "internally (constructively) well-pointed", and moreover satisfies the internal collection axiom schema. But the internal separation axiom schema is a strong condition on the topos, which I called being "autological". If a topos is autological and also Boolean, then the logic of its stack semantics is classical; thus it is internally a model of ETCSR. Since Grothendieck toposes are autological, your $\mathrm{Sh}_{\neg\neg}(S)$ is internally a model of ETCSR.
Now we can also prove that if $\mathcal{E}$ is Boolean and autological, so is any filterquotient of it. The idea is to prove a categorical version of Łoś's theorem for the stack semantics. (I don't know whether this is true without Booleanness, which annoys me to no end, but you probably don't care. (-: ) Therefore, $\varinjlim_{U\ni s} \mathrm{Sh}_{\neg\neg}(S)$ is also autological.
Finally, another fact about autology is that a well-pointed topos is autological if and only if it satisfies the ordinary structural-set-theory axiom schemas of separation and collection. Therefore, $\varinjlim_{U\ni s} \mathrm{Sh}_{\neg\neg}(S)$ satisfies these schemas, hence is a model of ETCSR.
However, I doubt that this particular filterquotient is related to forcing at all. The point is the same one that Jacob made in a comment: when set theorists force over a countable base model to make an "actual" new model, they find an actual generic ultrafilter outside that model. A generic ultrafilter in the base model would be a point of the topos $\mathrm{Sh}_{\neg\neg}(S)$, which as Andreas pointed out in a comment, does not exist. Your "points" $x$ are not points of the topos $\mathrm{Sh}_{\neg\neg}(S)$, so it's unclear to me whether filterquotients at them have anything to do with forcing.
Let me reiterate my argument that the real content of forcing is the internal logic of the topos $\mathrm{Sh}_{\neg\neg}(S)$. In particular, if you build a model of material set theory in this internal logic, what you get is essentially the Boolean-valued model that set theorists talk about. Edit: I think the rest of this answer is off-base; see the discussion in the comments. I'm pretty sure this is the best kind of "model" you can get if you don't want to start talking about countable models of ZFC sitting inside larger ambient models.
At the moment, my best guess for a topos-theoretic gloss on the countable-transitive-model version of forcing is something like the following. Suppose that $E$ is a countable model of ETCSR, containing an internal poset $P$, which we can equip with its double-negation topology. Then treating $E$ as the base topos, we can build $\mathrm{Sh}(P,E)$, a bounded $E$-indexed elementary topos (i.e. "$E$ thinks it is a Grothendieck topos"), which contains the Boolean-valued model associated to $P$ as described above. It is the classifying topos of $P$-generic filters, hence has in general no $E$-points.
But we also have the larger topos $\rm Set$ in which $E$ is countable, and we can consider the externalization $|P|$ which is a poset in $\rm Set$, namely $|P| = E(1,P)$. Then we can build the topos $\mathrm{Sh}(|P|,\rm Set)$ which "really is" a Grothendieck topos and classifies $|P|$-generic filters. The "Rasiowa–Sikorski lemma" implies that, since $E$ is countable, in this case such a filter does actually exist in $\rm Set$, so there is a point $p:\mathrm{Set} \to \mathrm{Sh}(|P|,\rm Set)$.
Now we should also have some kind of "externalization functor" $|-| : \mathrm{Sh}(P,E) \to \mathrm{Sh}(|P|,\rm Set)$. My guess is that the set-theorists' forcing model is the "image" (whatever that means) of the Boolean-valued model in $\mathrm{Sh}(P,E)$ under the composite of this externalization functor with the inverse image functor $p^* : \mathrm{Sh}(|P|,\rm Set) \to Set$. However, I have not managed to make this precise.
Best Answer
Your question is much more specific than your title suggests. As to the question itself, my answer is that it doesn't matter. The proof is given in mathematics, not in any formal system. A foundation for mathematics, in order to count as a foundation of mathematics, must be able to formalize most ordinary mathematical arguments. Thus, any foundation for mathematics could be used to formalize such a metatheorem.
With that said, some people care about whether metatheorems of this sort can be formalized in very weak theories. There's nothing wrong with that, but it's not something I spend my time thinking about, and in particular I didn't think about it when writing the proof you refer to. The closest I came was noting that instead of a "construction of a model" the proof can equivalently be regarded as giving a translation of first-order formulas. If you're interested in this sort of question, then Nik's answer seems reasonable to me, but I'm not familiar with the details of how this sort of thing goes.