To the best of my knowledge, this has never been "officially" described in the set theoretic literature. This has been described by Blass and Scedrov in Freyd's models for the independence of the axiom of choice (Mem. Amer. Math. Soc. 79, 1989). (It is of course implicit and sometimes explicit in the topoi literature, for example Mac Lane and Moerdijk do a fair bit of the translation in Sheaves in Geometry and Logic.) There are certainly a handful of set theorists that are well aware of the generalization and its potential, but I've only seen a few instances of crossover. In my humble opinion, the lack of such crossovers is a serious problem (for both parties). To be fair, there are some important obstructions beyond the obvious linguistic differences. Foremost is the fact that classical set theory is very much a classical theory, which means that the double-negation topology on a site is, to a certain extent, the only one that makes sense for use classical set theory. On the other hand, although very important, the double-negation topology is not often a focal point in topos theory.
Thanks to the comments by Joel Hamkins, it appears that there is an even more serious obstruction. In view of the main results of Grigorieff in
Intermediate submodels and generic extensions in set theory, Ann. Math. (2) 101 (1975), it looks like the forcing posets are, up to equivalence, precisely the small sites (with the double-negation topology) that preserve the axiom of choice in the generic extension.
Yes, this is a model of ETCSR. Unfortunately, I don't know of a proof of this in the literature, which is in general sadly lacking as regards replacement/collection axioms in topos theory. But here's a sketch.
As Zhen says, the filterquotient construction preserves finitary properties such as Booleanness and the axiom of choice. Moreover, a maximal filterquotient will be two-valued. But as you point out, a nondegenerate two-valued topos satisfying the (external) axiom of choice is necessarily well-pointed; I wrote out an abstract proof at https://ncatlab.org/nlab/show/well-pointed+topos#boolean_properties. Thus, $\varinjlim_{U\ni s} \mathrm{Sh}_{\neg\neg}(S)$ is a model of ETCS.
As for replacement, the proof that I know (which is not written out in the literature) goes by way of the notions of "stack semantics" and "autology" in my preprint Stack semantics and the comparison of material and structural set theories (the other half, not the part that became the paper of mine cited in the question).
Briefly, stack semantics is an extension of the internal logic of a topos to a logic containing unbounded quantifiers of the form "for all objects" or "there exists an object". (My current perspective, sketched in these slides, is that this is a fragment of the internal dependent type theory of a 2-topos of stacks -- hence the name!) This language allows us to ask whether a topos is "internally" a model of structural set theories such as ETCS or ETCSR.
It turns out that every topos is "internally (constructively) well-pointed", and moreover satisfies the internal collection axiom schema. But the internal separation axiom schema is a strong condition on the topos, which I called being "autological". If a topos is autological and also Boolean, then the logic of its stack semantics is classical; thus it is internally a model of ETCSR. Since Grothendieck toposes are autological, your $\mathrm{Sh}_{\neg\neg}(S)$ is internally a model of ETCSR.
Now we can also prove that if $\mathcal{E}$ is Boolean and autological, so is any filterquotient of it. The idea is to prove a categorical version of Łoś's theorem for the stack semantics. (I don't know whether this is true without Booleanness, which annoys me to no end, but you probably don't care. (-: ) Therefore, $\varinjlim_{U\ni s} \mathrm{Sh}_{\neg\neg}(S)$ is also autological.
Finally, another fact about autology is that a well-pointed topos is autological if and only if it satisfies the ordinary structural-set-theory axiom schemas of separation and collection. Therefore, $\varinjlim_{U\ni s} \mathrm{Sh}_{\neg\neg}(S)$ satisfies these schemas, hence is a model of ETCSR.
However, I doubt that this particular filterquotient is related to forcing at all. The point is the same one that Jacob made in a comment: when set theorists force over a countable base model to make an "actual" new model, they find an actual generic ultrafilter outside that model. A generic ultrafilter in the base model would be a point of the topos $\mathrm{Sh}_{\neg\neg}(S)$, which as Andreas pointed out in a comment, does not exist. Your "points" $x$ are not points of the topos $\mathrm{Sh}_{\neg\neg}(S)$, so it's unclear to me whether filterquotients at them have anything to do with forcing.
Let me reiterate my argument that the real content of forcing is the internal logic of the topos $\mathrm{Sh}_{\neg\neg}(S)$. In particular, if you build a model of material set theory in this internal logic, what you get is essentially the Boolean-valued model that set theorists talk about. Edit: I think the rest of this answer is off-base; see the discussion in the comments. I'm pretty sure this is the best kind of "model" you can get if you don't want to start talking about countable models of ZFC sitting inside larger ambient models.
At the moment, my best guess for a topos-theoretic gloss on the countable-transitive-model version of forcing is something like the following. Suppose that $E$ is a countable model of ETCSR, containing an internal poset $P$, which we can equip with its double-negation topology. Then treating $E$ as the base topos, we can build $\mathrm{Sh}(P,E)$, a bounded $E$-indexed elementary topos (i.e. "$E$ thinks it is a Grothendieck topos"), which contains the Boolean-valued model associated to $P$ as described above. It is the classifying topos of $P$-generic filters, hence has in general no $E$-points.
But we also have the larger topos $\rm Set$ in which $E$ is countable, and we can consider the externalization $|P|$ which is a poset in $\rm Set$, namely $|P| = E(1,P)$. Then we can build the topos $\mathrm{Sh}(|P|,\rm Set)$ which "really is" a Grothendieck topos and classifies $|P|$-generic filters. The "Rasiowa–Sikorski lemma" implies that, since $E$ is countable, in this case such a filter does actually exist in $\rm Set$, so there is a point $p:\mathrm{Set} \to \mathrm{Sh}(|P|,\rm Set)$.
Now we should also have some kind of "externalization functor" $|-| : \mathrm{Sh}(P,E) \to \mathrm{Sh}(|P|,\rm Set)$. My guess is that the set-theorists' forcing model is the "image" (whatever that means) of the Boolean-valued model in $\mathrm{Sh}(P,E)$ under the composite of this externalization functor with the inverse image functor $p^* : \mathrm{Sh}(|P|,\rm Set) \to Set$. However, I have not managed to make this precise.
Best Answer
A solution to your first question is given in "Sketches of an Elephant" by Johnstone, Example A4.4.2(d). If a topos $\mathcal{E}$ is the topos of sheaves on some internal site in a topos $\mathcal{S}$, then there is a natural geometric morphism $p : \mathcal{E} \to \mathcal{S}$. It turns out that there are toposes that do not admit geometric morphisms to Boolean toposes whatsoever. Two examples given by Johnstone are the effective topos and the topos of triples $(A,B,f)$ where $A$ is a set, $B$ is a finite set, and $f : A \to B$ is a function.
So these toposes cannot be written as a topos of sheaves over a Boolean topos (so also not over a well-pointed topos, because well-pointed toposes are Boolean).
Below is the argument that a geometric morphism $p : \mathcal{E} \to \mathcal{S}$ does not exist, for $\mathcal{E}$ the topos of triples $(A,B,f)$ as above and $\mathcal{S}$ a Boolean topos. I follow the Elephant, Example A4.5.24. Because subtoposes of Boolean toposes are again Boolean, we can assume that $p$ is surjective, so $p^*$ is faithful. Further, in $\mathcal{S}$ all objects are decidable, i.e. the diagonal embeddings $X \to X \times X$ have a complement. So the same holds for objects of the form $(A,B,f) = p^*(X)$. This condition implies that $f$ is injective, and as a result both $A$ and $B$ are finite. We conclude that there are only finitely many morphisms $p^*(X) \to p^*(Y)$, and because $p^*$ is faithful, it follows that there are only finitely many morphisms $X \to Y$, for $X,Y$ arbitrary. However, for $(A,1,f)$ with $A$ infinite, we get that there are infinitely many morphisms $1 \to p_*(A)$, so this gives a contradiction.