Given any partial order $P$, we see that $Sets^{P^{op}}$ is generated by the subobjects of 1. In fact, it is generated by the representable functors, which are all subobjects of 1 in this case.
Therefore, $Sh_{\neg \neg}(P)$ will also be generated by subobjects of 1. Recall that a Grothendieck topos generated by subobjects of 1 is known as a localic topos; for such a topos $T$, there is an equivalence between $T$ and $Sh(Sub_T(1))$; the latter is the topos of sheaves on the complete Heyting algebra (that is, the locale) of subobjects of $1$ in $T$. So another description of a localic topos is one isomorphic to the topos of sheaves on a locale.
Localic toposes have some nice properties. Assuming the axiom of choice, every Boolean localic topos satisfies the external axiom of choice - that every epimorphism has a section. Conversely, assuming excluded middle, this theorem implies the axiom of choice, since $Sets$ itself is a Boolean localic topos (the topos of sheaves on $2$). So throughout, we will assume the axiom of choice (which is fine if we're doing relative consistency with ZFC).
We fix a complete Boolean algebra $B$. We are interested in $Sh(B)$ and in $V^{(B)}$ and their internal logics.
There are numerous ways of getting a model of IZF from a Grothendieck topos. In this case, we use a classic technique of Fourman which involves internally constructing the set hierarchy within a cocomplete topos $T$. For a complete and rigorous exposition, see his article "Sheaf models for set theory".
We define, for each ordinal $\alpha$, a pair $(V_\alpha, \in_\alpha)$. $V_\alpha$ is an object in $Sh(B)$, and $\in_\alpha$ is a subobject of $V_\alpha^2$. We additionally have that $V$ acts functorialy; for $\alpha \leq \beta$, there is a map $i_{\alpha \leq \beta} : V_\alpha \to V_\beta$, and these maps are functorial. We also want to have each $i_{\alpha < \beta}$ be a monomorphism, and we want $\forall x, y \in V_\alpha (x \in_\alpha y \iff i_{\alpha \leq \beta}(x) \in_\beta i_{\alpha \leq \beta}(y))$ to hold in the internal logic.
The definitions of these are as follows (using internal logic notation):
- $V_0 = 0$, and $\in_0 = 0$
- $V_{\alpha + 1} = P(V_\alpha)$. Define $i_{\alpha \leq \alpha + 1}(x) = \{w \in V_\alpha \mid w \in_\alpha x\}$. Define $\in_{\alpha + 1} = \{(i_{\alpha \leq \alpha + 1}(x), y) \mid x \in_{V_\alpha} y\}$.
- For $\beta$ a limit ordinal: Define $V_\beta = colim_{\alpha < \beta} V_\alpha$. For $\alpha < \beta$, let $i_{\alpha \leq \beta}$ be the colimit map $V_\alpha \to colim_{\alpha < \beta} V_\alpha$. Finally, let $\in_\beta = \bigvee\limits_{\alpha < \beta} \{(i_{\alpha \leq \beta}(x), i_{\alpha \leq \beta}(y)) \mid x \in_\alpha y\}$.
Feel free to carefully double-check that all the claimed are satisfied. Also feel free to double-check various other niceties - for example, show in the internal logic that $\forall a, b \in V_\alpha (\forall c \in V_\alpha (c \in_\alpha a \iff c \in_\alpha b) \to a = b)$.
We now define the grand interpretation of set theory. For every list of variables $x_1, \ldots, x_n$, every statement $\phi(x_1, \ldots, x_n)$ over them, and every choice of ordinals $\beta_1, \ldots, \beta_n$, we define the subobject $\{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \phi(x_1, \ldots, x_n)\}$ as follows:
- $\{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \phi(x_1, \ldots, x_n) \land \psi(x_1, \ldots, x_n)\} = \{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \phi(x_1, \ldots, x_n)\} \land \{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \psi(x_1, \ldots, x_n)\}$ (and analogously for $\lor, \implies$)
- $\{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \top\} = V_{\beta_1} \times \cdots \times V_{\beta_n}$
- $\{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \bot\} = 0$
- $\{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \exists y \phi(x_1, \ldots, x_n, y)\} = \bigvee\limits_{\gamma \in Ord} \exists_y \{(x_1, \ldots, x_n, y) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \times V_\gamma \mid \phi(x_1, \ldots, x_n, y)\}$ (and analogously for $\forall$)
- $\{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid x_i = x_j\} = \{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid i_{\beta_i \leq \max(\beta_i, \beta_j)}(x_i) = i_{\beta_j \leq \max(\beta_i, \beta_j)}(x_j)\}$
- $\{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid x_i \in x_j\} = \{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid i_{\beta_i \leq \max(\beta_i, \beta_j)}(x_i) \in_{\max(\beta_i, \beta_j)} i_{\beta_j \leq \max(\beta_i, \beta_j)}(x_j)\}$
In particular, then, a sentence $\phi$ gives rise to the subobject $\{- \in 1 \mid \phi\}$. The "truth values" of sentences therefore lie in the complete heyting algebra $Sub(T)$. We denote the truth value of $\phi$ by $||\phi||$. In fact, we slightly generalize this. Given $x_1 \in V_{\beta_1}(1), \ldots, x_n \in V_{\beta_n}(1)$, we write $||\phi(x_1, \ldots, x_n)||$ to denote the largest $U$ such that $(x_1, \ldots, x_n)|_U \in \{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \phi(x_1, \ldots, x_n)\}(U)$.
One particular property which must be verified is that given $\beta_1 \leq \gamma_1, \ldots, \beta_n \leq \gamma_n$, the subobject $\{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \phi(x_1, \ldots, x_n)\}$ is the pullback of $\{(y_1, \ldots, y_n) \in V_{\gamma_1} \times \ldots \times V_{\gamma_n} \mid \phi(y_1, \ldots, y_n)\}$ along the map $i_{\beta_1 \leq \gamma_1} \times \cdots \times i_{\beta_n \leq \gamma_n}$. This property is sufficient to see that for $x_1 \in V_{\beta_1}(1), \ldots, x_n \in V_{\beta_n}(1)$, $||\phi(x_1, \ldots, x_n)|| = ||\phi(i_{\beta_1 \leq \gamma_1}(x_1), \ldots, i_{\beta_n \leq \gamma_n}(x_n))||$.
Another property: consider $x \in V_\beta$ and $y \in V_{\gamma + 1}$. Then there exists a unique $t \in V_{\min(\beta, \gamma)}(||x \in y||)$ such that (1) $||x \in y|| \vdash t \in_{V_\beta} y$, and (2) $i_{\min(\beta, \gamma)}(t) = x|_{||x \in y||}$.
One can verify that all the laws of intuitionist logic hold for this interpretation, and that all axioms of IZF hold. If $T$ is Boolean, one can verify that moreover, all of ZF holds; if choice applies in $T$, choice will hold in its model of ZF, thus giving us a model of ZFC.
We now focus our attention again on the case $T = Sh(B)$, identifying $Sub_T(1)$ with $B$. We are now ready to define a family of maps $f_\beta : V^{(B)}_\beta \to V_\beta(1)$.
- $V^{(B)}_0 = \emptyset$, so $f_0$ is trivial.
- $f_{\beta + 1}(x)$ should give us a subsheaf of $V_\beta$. This is the subsheaf such that $f_{\beta + 1}(x)(U) = \{y \in V_\beta(U) \mid U = \bigvee \{V \leq U \mid \exists z \in V^{(B)}_\beta (f_\beta(z)|_V = y|_V \land V \leq x(z)\}\}$. This can also be described as the subsheaf generated by $\{f_\beta(z)|_{x(z)}\}$.
- For $\beta$ a limit ordinal, $\alpha < \beta$, and $x \in V^{(B)}_\alpha$, $f_\beta(x) = i_{\alpha \leq \beta}(1)(f_\alpha(x))$
It is clear from the definition that for $y \in V^{(B)}_{\beta + 1}$ and $t \in V^{(B)}_\beta$, we have $f_\beta(t)|_{y(t)} \in f_{\beta + 1}(y)(y(t))$. So we have $y(t) \leq ||f_\beta(t) \in f_{\beta + 1}(y)||$.
It also follows from the definition that each $f_{\beta + 1}$ is surjective. In fact, we can define a one-sided inverse to $f_{\beta + 1}$ by sending each $x \in V_{\beta + 1}(1)$ to $u \mapsto \bigvee \{U \in B \mid f_\beta(u)|_U \in x(U)\}$.
Theorem. For all $\phi(x_1, \ldots, x_n)$, for all ordinals $\beta_1, \ldots, \beta_n$, for all $x_1 \in V^{(B)}_{\beta_1}, \ldots, x_n \in V^{(B)}_{\beta_n}$, $||\phi(x_1, \ldots, x_n)|| = ||\phi(f_{\beta_1}(x_1), \ldots, f_{\beta_n}(x_n))||$.
The proof is by induction, first on $\phi$, then on the ordinals. The cases $\phi = \bot, \top$ are trivial. Even more trivial is the case where some $\beta_i = 0$; then, there is no choice $x_i \in V^{(B)}_{\beta_i}$ to speak of. Also trivial are the cases where any $\beta_i$ is a limit ordinal, since we would take some $\alpha < \beta_i$ such that $x_i \in V^{(B)}_\alpha$ and note that $||\phi(x_1, \ldots, x_i, \ldots, x_n)|| = ||\phi(f_{\beta_1}(x_1), \ldots, f_{\alpha}(x_i), \ldots, f_{\beta_n}(x_n))|| = ||\phi(f_{\beta_1}(x_1), \ldots, i_{\alpha \leq \beta_i}(f_{\alpha}(x_i)), \ldots, f_{\beta_n}(x_n))|| = ||\phi(f_{\beta_1}(x_1), \ldots, f_{\beta_i}(x_i), \ldots, f_{\beta_n}(x_n))||$.
It is fairly straightforward to use the inductive hypothesis appropriately for when $\phi(x_1, \ldots, x_n) = \psi(x_1, \ldots, x_n) \square \theta(x_1, \ldots, x_n)$ for $\square \in \{\land, \lor, \implies\}$.
The existential quantification is rather intriguing. One direction of the proof is trivial. We clearly have $||\exists y (\phi(x_1, \ldots, x_n, y))|| = \bigvee\limits_{\gamma, y \in V^{(B)}_\gamma} ||\phi(x_1, \ldots, x_n, y)|| = \bigvee\limits_{\gamma, y \in V^{(B)}_\gamma} ||\phi(f_{\beta_1}(x_1), \ldots, f_{\beta_n}(x_n), f_\gamma(y))|| \leq \bigvee\limits_{\gamma, z \in V_\gamma} ||\phi(f_{\beta_1}(x_1), \ldots, f_{\beta_n}(x_n), z|| = ||\exists y(\phi(f_{\beta_1}(x_1), \ldots, f_{\beta_n}(x_n), y))||$.
Conversely, the map $\gamma \mapsto \exists_y \{(x_1, \ldots, x_n, y) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \times V_\gamma \mid \phi(x_1, \ldots, x_n, y)\}$ is monotonically increasing. Therefore, by a size argument, there exists some $\gamma$ such that $\exists_y \{(x_1, \ldots, x_n, y) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \times V_\gamma \mid \phi(x_1, \ldots, x_n, y)\} = \{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \exists y \phi(x_1, \ldots, x_n, y)\}$. It follows that the projection map $\{(x_1, \ldots, x_n, y) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \times V_\gamma \mid \phi(x_1, \ldots, x_n, y)\} \to \{(x_1, \ldots, x_n) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \mid \exists y \phi(x_1, \ldots, x_n, y)\}$ is an epimorphism. Let $U = ||\exists y (\phi(x_1, \ldots, x_n))||$. By the external axiom of choice, it is possible to choose some $y \in V_\gamma(U)$ such that $(x_1|_U, \ldots, x_n|_U, y) \in \{(x_1, \ldots, x_n, y) \in V_{\beta_1} \times \cdots \times V_{\beta_n} \times V_\gamma \mid \phi(x_1, \ldots, x_n, y)\}(U)$. Pick some $z \in V_{\gamma + 1}(1)$ such that $z|_U = i_{\gamma \leq \gamma + 1}(y)$; write $z = f_{\gamma + 1}(w)$ for some $w \in V^{(B)}_{\gamma + 1}$. Then $U \leq ||\phi(f_{\beta_1}(x_1), \ldots, f_{\beta_n}(x_n), f_{\gamma + 1}(w))|| = ||\phi(x_1, \ldots, x_n, w)|| \leq ||\exists y (\phi(x_1, \ldots, x_n, y))||$.
We skip the universal quantifier entirely, since in classical logic, it is redundant (and both the sheaf model and the Boolean-valued model are classical).
We are left with the simplest base case; the propositions $x \in y$ and $x = y$, where $x \in V^{(B)}_{\delta + 1}$ and $y \in V^{(B)}_{\epsilon + 1}$. It turns out we have to do these inductions together, since each proposition depends on the other. Note that $||x \in y|| = \bigvee_{t \in V^{(B)}_\epsilon} y(t) \land ||x = t||$. Now for each $t \in V^{(B)}_\epsilon$, we see that $y(t) \land ||x = t|| = y(t) \land ||f_{\delta + 1}(x) = f_\epsilon(t)|| \leq ||f_\epsilon(t) \in f_{\epsilon + 1}(y)|| \land ||f_{\delta + 1}(x) = f_\epsilon(t)|| = ||f_\epsilon(T) \in f_{\epsilon + 1}(y) \land f_{\delta + 1}(x) = f_\epsilon(t)|| \leq ||f_{\delta + 1}(x) \in f_{\epsilon + 1}(y)||$, using the fact that the Fourman semanics are complete under deduction.
In the other direction, let $u = ||f_{\delta + 1}(x) \in f_{\epsilon + 1}(y)||$. We take some $t \in V_{\min(\delta + 1, \epsilon)}$ such that $t \in y(U)$ and $i_{\min(\delta + 1, \epsilon) \leq \delta + 1)}(t) = f_{\delta + 1}(x)|_U$. Then by the definition of $f_{\epsilon + 1}$, we have $U = \bigvee \{V \leq U \mid \exists z \in V^{(B)}_\epsilon (f_\epsilon(z)|_V = t|_V \land V \leq y(z))\}$. This is enough to conclude that $U \leq ||x \in y||$, which completes the inductive step for $\in$. A similar inductive step proof exists for $=$; I may come back and add it in later, but I've already been working on this for a few hours and I'm getting a bit tired.
So Boolean-valued model semantics is just the semantics of the Fourman model derived from $Sh(B)$.
One final note here. We generally work with the internal logic of $Sh(B)$ without trying to leverage this into any sort of theorem about ZFC directly. Sheaves in Geometry and Logic presents the internal logic of a topos as higher-order typed logic where all quantifiers are bounded. Shulman extends this notion to allow for unbounded quantifiers. This extension of the rules of forcing allows one to describe the semantics of $X \Vdash \phi$, where $X$ is an object in the topos and $\phi$ is some statement in the language of category theory which does not discuss equality of objects. A statement $\phi$ is globally true if $1 \Vdash \phi$. See "Stack semantics and the comparison of material and structural set theories".
In that paper, Shulman demonstrates that for any sentence $\phi$ in the category of sets and any Grothendieck topos which satisfies the internal axiom of choice, $||\phi'|| = 1$ if and only if $\Vdash \phi$. Here, $\phi'$ is the canonical statement in the language of set theory which says "$\phi$ holds in the category of sets", and $||\phi'||$ is the Fourman semantics described above.
Combining all these things, we can observe that Mac Lane and Moerdijk came up with a Grothendieck topos which satisfies choice but also satisfies the rejection of the continuum hypothesis, as expressed in higher-order logic and interpreted through the internal logic. This topos is equivalent to $Sh(B)$ for a specific complete Boolean algebra $B$, so in this topos, choice holds but not the continuum hypothesis in the internal logic. Shulman's result then proves that in the Fourman semantics, we must also have choice holding and the continuum hypothesis failing. Finally, the above result means that the Fourman semantics of $Sh(B)$ are equivalent to the classic semantics of the Boolean-valued model on $B$.
I would personally like to add that I generally consider Shulman's stack semantics to be more elegant and easier to work with than the Fourman semantics. Shulman's stack semantics are also far more general - they apply not only to any elementary topos but also to weaker kinds of categories which have some features of a topos but not others, and they are totally constructive. However, in this instance, the less elegant Fourman semantics are needed since they are closer to the even less elegant Boolean-valued model semantics.
Best Answer
Yes, the proof can be rewritten to work.
New proof:
Proof: we proceed by induction on $n$. The base case $n = 0$ is trivial, since there is only one $q \in \mathbb{P}$ such that $|F_q| = 0$.
For the inductive step, we suppose that for all sets $B$ of incompatible conditions, each with $k$ elements, $B$ is countable.
We now consider a set $A$ of incompatible conditions, each with $k + 1$ elements. If $A$ is empty, then we're done. Otherwise, take some $p \in A$, and enumerate $F_p = \{x_0, \ldots, x_k\}$.
For each $j \leq k$, define $S_j = \{q \in A \mid x_j \in F_q$ and $q(x_j) \neq p(x_j)\}$. Then $A = \{p\} \cup \bigcup\limits_{i = 0}^k S_i$, so it suffices to show that each $S_i$ is countable.
Now for each $j \leq k$, define $R_j = \{q|_{F_q \setminus \{x_j\}} \mid q \in S_j\}$. Note that the restriction map $q \mapsto q|_{F_q \setminus \{x_j\}} : S_j \to R_j$ is a bijection, since we can recover $q$ from $q|_{F_q \setminus \{x_j\}}$ because we know that $q(x_j) \neq p(x_j)$, so there's only one possible value for $q(x_j)$. So it suffices to prove that each $R_j$ is countable.
Now given two conditions $r \neq q \in S_j$, we know that $r$ and $q$ are incompatible. Therefore, there is some $w \in F_q \cap F_r$ such that $q(w) \neq r(w)$. But note that $q(x_j) = r(x_j)$, and therefore we have $w \neq x_j$. That is, $w \in (F_q \setminus \{x_j\}) \cap (F_r \setminus \{x_j\})$. Therefore, $w$ witnesses that $q|_{F_q \setminus \{x_j\}}$ and $r|_{F_r \setminus \{x_j\}}$ are incompatible. So $R_j$ is a set of incompatible conditions, each of size $k$. Therefore, $R_j$ is countable by the inductive hypothesis.
Proof: $A = \bigcup\limits_{n \in \mathbb{N}} A_n$ where $A_n = \{p \in A \mid |F_p| = n\}$, and each $A_n$ is countable. So $A$ is also countable.