[Math] Does Borel’s proof for existence of normal numbers make an essential use of axiom of choice

axiom-of-choicemeasure-theorynt.number-theorypr.probabilityset-theory

A normal number is a real number whose infinite sequence of digits in every base $b$ is distributed uniformly in the sense that each of the $b$ digit values has the same natural density $\frac{1}{b}$, also all possible $b^2$ pairs of digits are equally likely with density $\frac{1}{b^2}$, all $b^3$ triplets of digits equally likely with density $\frac{1}{b^3}$, etc.

Using the Borel–Cantelli lemma, Borel proved the normal number theorem:

Theorem. Almost all real numbers are normal, in the sense that the set of non-normal numbers has Lebesgue measure zero.

But counterintutively we know just a very few number of such normal real numbers. This is mainly because Borel's proof is not constructive.

My questions are:

1- Does Borel's proof use axiom of choice? If yes, is this use really essential? In the other words, does Borel's theorem hold even when axiom of choice fails or it possible to have few normal real numbers, say a set of Lebesgue measure zero for such numbers?

2- Is Borel's proof really non-constructive? If yes, what is the source of non-constructiveness in Borel's proof for existence of normal numbers? Axiom of choice or something else?

Best Answer

The fact that almost all reals are normal in base $b$ is an application of the Strong Law of Large Numbers. Write the base $b$ expansion as $X = \sum_{i=1}^\infty b^{-i} D_i$ where $X$ is uniform on $[0,1]$ and $D_i$ are iid with $P(D_i = y) = 1/b$ for $y = 0,1,\ldots,b-1$. The Strong Law of Large Numbers applied to the indicators $I(D_i = y)$ says that for each $y = 0,1,\ldots, b-1$, $$\lim_{N \to \infty} \dfrac{1}{N} \sum_{i=1}^N I(D_i=y) = \dfrac{1}{b} $$ The proof (e.g. using fourth moments) makes no use of Axiom of Choice, as far as I can see. If $S_N = \sum_{i=1}^N (I(D_i=y) - 1/b)$, you use independence and Cauchy-Schwarz to get a bound $\mathbb E[S_N^4) \le K N^2$, which implies $\mathbb E \sum_N (S_N/N)^4 \le \sum_N K/N^2 < \infty$, and therefore $\sum_N (S_N/N)^4 < \infty$ almost surely, and in particular $S_N/N \to 0$ almost surely.

EDIT: To make this more explicit, we can say $$\mathbb P\left(\dfrac{|S_N(b,y)|}{N} > \epsilon\right) < \dfrac{K(b)}{\epsilon^4 N^2}$$ Moreover, the event $|S_N(b,y)|/N > \epsilon$ is equivalent to $X$ belonging to an explicit union of intervals in $[0,1]$ of total length $< K(b)/(\epsilon^4 N^2)$. We can construct a function $N(\eta, \epsilon, b)$ for $\eta, \epsilon > 0$ so that $$\sum_{N > N(\eta, \epsilon, b)} \mathbb P\left(\dfrac{|S_N(b,y)|}{N} > \epsilon\right) < \eta$$

If we take $\epsilon = 1/M$ and $\eta = 2^{-b-M} \delta/b$, we get

$$ \sum_{M=1}^\infty \sum_{b=2}^\infty \sum_{y=0}^{b-1} \sum_{N > N(1/M, 2^{-b-M} \delta/b)} \mathbb P\left(\dfrac{|S_N(b,y)|}{N} > \dfrac{1}{M} \right) < \delta $$

Thus we have an explicit sequence of intervals in $[0,1]$ with total length $< \delta$, and every point in the complement of the union of those intervals is normal in every base. For convenience, let's reindex these intervals as $J_n$, $n = 1,2,3,\ldots$.

Now, how do we arrange (without Axiom of Choice) to find points in $G = [0,1] \backslash \bigcup_n J_n$? One way is to construct a nested sequence of closed intervals $G_n$ as follows. We will ensure that $m(G_n \cap G) > (1-(2-2^{-n})\delta) m(G_n)$ and $G_n \cap \bigcup_{j\le n} J_j = \emptyset$. Then the limit of the left endpoints of $G_n$ as $n \to \infty$ is in $G$, as desired.

Start with the interval $G_0 = [0,1]$.
Given $G_{n-1}$, $G_{n-1} \backslash J_n$ is either a single interval $A$ (plus perhaps a single point, which we ignore) or the union of two intervals $A \cup B$. If it's a single interval $A$, we have $m(A \cap G) = m(G_{n-1} \cap G) > (1-(2-2^{-n+1}) \delta) m(A)$. If it's two intervals, $m(A \cap G) + m(B \cap G) > (1-(2-2^{-n+1}) \delta) (m(A) + m(B))$, so

$$\max\left(\dfrac{m(A \cap G)}{m(A)}, \dfrac{m(B \cap G)}{m(B)}\right) > 1-(2-2^{-n+1}) \delta$$ and we choose whichever of $A$ and $B$ gives the largest quotient here (in case of a tie, for definiteness take the one on the left). We want $G_n$ to be closed, so take a slightly smaller closed interval, still maintaining $m(G_n \cap G) > (1 - (2 - 2^{-n})) \delta$. This gives the desired construction.