Elementary Set Theory – Intuition Behind Cantor-Bernstein-Schröder

elementary-set-theoryfunctionsintuition

The book I am working from (Introduction to Set Theory, Hrbacek & Jech) gives a proof of this result, which I can follow as a chain of implications, but which does not make natural, intuitive sense to me. At the end of the proof, I found myself quite confused, and had to look carefully at the build-up to see how the conclusion followed. I get the steps, now – but not the intuition.


The authors took sets $X$ and $Y$ and assumed injections $f: X \rightarrow Y$, $g: Y \rightarrow X$. Since $X \sim g[f[X]]$ and $X \supseteq g[Y] \supseteq g[f[X]]$, and $Y \sim g[Y]$, the authors went to prove the lemma that $A \sim A_1, A \supseteq B \supseteq A_1 \implies A \sim B$.

For the lemma, they defined recursive sequences $\{A_n\}_{n \in \omega}, \{B_n\}_{n \in \omega}$ by $A_0 = A, A_{n+1} = f[A],$ $ B_0 = B, B_{n+1} = f[B].$ Since $A_0 \supseteq B_0 \supseteq A_1 \implies f[A_0] \supseteq f[B_0] \supseteq f[A_1]$, we get from induction that $A_{n+1} \subseteq A_n$. Putting $\{C_n\}_{n \in \omega} = \{A_n – B_n\}_{n \in \omega},$ they noted that $C_{n+1} = f[C_n]$ (since $A_n \supseteq B_n$ inductively, $f[A_n – B_n] = f[A_n] – f[B_n]$). Putting

$$ C = \bigcup_{n=0}^\infty C_n, \text{ } D = A – C,$$

they noted that $f[C] = \bigcup_{n=1}^\infty C_n$, and that $h(x): A \rightarrow f[C] \cup D$ can be defined by sending $x \in C \rightarrow f(x),$ and $x \in D \rightarrow x$. It is clear that $h$ is one-to-one and onto.

Inductively, for $n > 0$, $C_0 \cap C_n = \varnothing$; it follows that $C_0 \cap \bigcup_{n=1}^\infty C_n = C_0 \cap f[C] = \varnothing$. We know that $C_0 \cup f[C] \cup D$ = A, and since all three sets are disjoint, we may conclude that $f[C] \cup D = A – C_0 = A – (A – B) = B$. Thus, our bijection $h$ maps $A$ to $B$, as we wanted.


What is the intuition here? What were H&J, or Cantor, Bernstein, etc. thinking when they went to prove this – the "high-level" idea? Is there a clearer proof, in the sense of thought process (not necessarily shorter)?

Best Answer

$\newcommand{\ran}{\operatorname{ran}}$Here’s one way to think about it. Suppose that $y\in\ran f$; then we can pull $y$ back to $f^{-1}(y)\in X$. If $f^{-1}(y)\in\ran g$, we can pull it back to $g^{-1}(f^{-1}(y))\in Y$. If we continue this pulling back, one of two things must happen: either we reach a dead end at a point of $X$ or $Y$ that can’t be pulled back (because it’s in $Y\setminus\ran f$ or $X\setminus\ran g$), or we don’t.

Let $X_0=X\setminus\ran g$, the set of points of $X$ that cannot be pulled back at all, and let $Y_0=Y\setminus\ran f$. More generally, for each $n\in\omega$ let $X_n$ be the set of points of $X$ that can be pulled back exactly $n$ times, and let $Y_n$ be the set of points of $Y$ that can be pulled back exactly $n$ times. Finally, let $X_\omega$ and $Y_\omega$ by the subsets of $X$ and $Y$, respectively whose points can be pulled back infinitely many times.

At this point a sketch helps; it should show the partitions $\{X_n:n\le\omega\}$ of $X$ and $\{Y_n:n\le\omega\}$ of $Y$, and it should include arrows indicating what parts of $X$ get mapped to what parts of $Y$ and vice versa. To avoid having arrows crossing, I’ve taken $X$ and $Y$ apart in the following diagram.

$$\begin{array}{} X_0&\overset{f}\longrightarrow&Y_1&\overset{g}\longrightarrow&X_2&\overset{f}\longrightarrow& Y_3&\overset{g}\longrightarrow&X_4&\dots&X_\omega\\ Y_0&\overset{g}\longrightarrow&X_1&\overset{f}\longrightarrow&Y_2&\overset{g}\longrightarrow&X_3&\overset{f}\longrightarrow&Y_4&\dots&Y_\omega \end{array}$$

Each of the arrows is a bijection, so I can break up the diagram into $\omega$ self-contained parts. The first two parts are:

$$\begin{array}{} X_0&\overset{f}\longrightarrow&Y_1\\ Y_0&\overset{g}\longrightarrow&X_1 \end{array}\qquad \begin{array}{} X_2&\overset{f}\longrightarrow&Y_3\\ Y_2&\overset{g}\longrightarrow&X_3 \end{array}$$

Ignoring $X_\omega$ and $Y_\omega$ for the moment, I can rearrange the rest of the diagram to give my a bijection from $X\setminus X_\omega$ to $Y\setminus Y_\omega$:

$$\begin{array}{ccc} X_0&\overset{f}\longrightarrow&Y_1\\ X_1&\overset{g^{-1}}\longrightarrow&Y_0\\ X_2&\overset{f}\longrightarrow&Y_3\\ X_3&\overset{g^{-1}}\longrightarrow&Y_2\\ \vdots&\vdots&\vdots\\ X_{2k}&\overset{f}\longrightarrow&Y_{2k+1}\\ X_{2k+1}&\overset{g^{-1}}\longrightarrow&Y_{2k}\\ \vdots&\vdots&\vdots \end{array}$$

Finally, I claim that $f[X_\omega]=Y_\omega$: everything in $X_\omega$ can be pulled back infinitely often, so everything in $f[X_\omega]$ can be pulled back infinitely often, and therefore $f[X_\omega]\subseteq Y_\omega$. On the other hand, if $y\in Y_\omega$, then $y$ can be pulled back infinitely often, so it must be possible to pull $f^{-1}(y)$ back infinitely often, and therefore $f^{-1}(y)\in X_\omega$. Thus, $Y_\omega\subseteq f[X_\omega]$ as well. The diagram above can now be completed to show a bijection from $X$ onto $Y$:

$$\begin{array}{ccc} X_0&\overset{f}\longrightarrow&Y_1\\ X_1&\overset{g^{-1}}\longrightarrow&Y_0\\ X_2&\overset{f}\longrightarrow&Y_3\\ X_3&\overset{g^{-1}}\longrightarrow&Y_2\\ \vdots&\vdots&\vdots\\ X_{2k}&\overset{f}\longrightarrow&Y_{2k+1}\\ X_{2k+1}&\overset{g^{-1}}\longrightarrow&Y_{2k}\\ \vdots&\vdots&\vdots\\ X_\omega&\overset{f}\longrightarrow&Y_\omega \end{array}$$

The bijection is defined piecewise, but that’s no problem.

There are a few details to be filled in to make this a fully rigorous proof, but I think that it does give a reasonable idea of one possible intuition.

Added: Here’s a very rough sketch. Arrows from left to right are (parts of) $f$, and arrows from right to left are (parts of) $g$.

enter image description here

Related Question