[Math] Understanding the definition of chain

definitionorder-theory

I'm reading Pedersen's 'Analysis Now' at the moment and I'm quite confused about the definition of chain given there. I'm used to the definition that for a partially ordered set $X$ a chain $C$ is a totally ordered subset of $X$. In 'Analysis Now' the definitions are as follows:

An order $\leq$ on $X$ is a transitive, reflexive and antisymmetric binary relation.

I'm ok with that. That's what I know as a partial order but probably only to differentiate it from a total order.

An ordered set $X$ is called well-ordered if every nonempty subset $Y\subseteq X$ has a smallest element.

So for any $x,y\in X$ the set $\{x,y\}$ has a smallest element. Thus a well-ordered set is even totally ordered since all elements are comparable.

A choice function on $X$ is a function $c:\mathcal{P}(X)\setminus\{\emptyset\}\to X$ such that $c(A)\in A$ for all $A\in\mathcal{P}(X)\setminus\{\emptyset\}$.

Fine.

For any subset $Y$ of an ordered set $X$ let $\operatorname{maj}(Y)$ and $\operatorname{min}(Y)$ denote the sets of proper majorants and minorants for $Y$ in $X$.

Now to the definition that confuses me:

Let $(X,\leq)$ be an ordered set and assume that $c$ is a choice function for $X$. A subset $C$ of $X$ is called a chain if it is well-ordered and for each $x\in C$ we have $$c(\operatorname{maj}(C\cap\operatorname{min}\{x\}))=x.$$

As noted above $C$ is totally ordered since it is well-ordered. So the definition of chain given here is probably more restrictive than the usual one. What does the second condition say? $C\cap\operatorname{min}\{x\}$ is the set of all proper minorants of $x$ in $C$. The majorants of this set is at least $C\cap\operatorname{maj}\{x\}\cup\{x\}$ may be more. But why on earth should the choice function on this set be $x$? As I see it the choice function is quite arbitrary.

I'm confused and would be grateful for help!

Best Answer

The quoted material looks like part of a fairly common proof of Zorn's Lemma, so I'll try to explain it from that point of view.

Let me pretend, for a moment, that we know about ordinal numbers and the notion of definition by transfinite induction. Then there is a rather natural way to prove Zorn's Lemma, making use of a choice function $c$ on the (partially) ordered set $X$. Namely, use $c$ to select an element of $X$ (namely $c(X)$). If we're very lucky, this element is maximal in $X$ and we're done. Otherwise, there are strict upper bounds for the chosen element, and we can use $c$ to select one of these. If that's a maximal element, we're done, and otherwise we use $c$ to select a strict upper bound. Continue in this way, transfinitely, always choosing a strict upper bound for all the previously chosen elements. (One needs to verify, by transfinite induction, that the chosen elements form a chain in the usual sense, a totally ordered subset of $X$, so they always have an upper bound, by the hypothesis of Zorn's Lemma, and if this isn't maximal then there will be strict upper bounds to choose from at the next step.) Since the transfinite induction cannot continue through all the ordinals, it must end, and that happens only when it reaches a maximal element.

This argument is, in my opinion, the natural way of proving Zorn's Lemma, but it requires some preparation --- information about ordinals and transfinite induction. So some authors circumvent this preparation, by essentially proving, inside the proof of Zorn's Lemma, just enough about well-ordering and transfinite induction to make the proof work. In other words, they don't prove (or even state) any general principles of transfinite induction but just establish the one instance needed in this proof.

So their goal is to describe, without saying "transfinite induction", the same chain that I described above, using transfinite induction. The result is usually some circumlocution that doesn't explicitly say "transfinite induction" but is best understood by thinking of transfinite induction. Specifically, one ultimately wants a well-ordered chain in which each element $x$ is the chosen (by $c$) strict upper bound for all the elements that precede $x$. And the chain should continue for as long as possible, namely until a maximal element is reached (at which point there are no more strict upper bounds). This "as long as possible" is obtained by taking "approximations" that might stop short, but then forming the union of all these approximations; these approximations are the "chains" in the material quoted in the question, and the complicated definition of "chain" is just saying that each $x$ is $c$'s chosen strict upper bound for its predecessors.

The same approximation technique is used in the general proof that one can define functions by transfinite recursion, but, instead of being isolated in that general principle, it is here mixed with the rest of the argument for Zorn's Lemma, and the mixture is, in my opinion, unnecessarily confusing.

Related Question