[Math] Intuition behind Khinchin’s constant

continued-fractions, ergodic-theory, geometric-topology, low-dimensional-topology, number-theory

Khinchin proved that

For almost all reals $r$ with continued fraction representation
$[a_0; a_1, a_2, \dots ]$, the sequence $K_n = \left(\prod_{i=1}^{n} a_i\right)^{1/n}$ converges to a constant $K$ (Khinchin's constant) independent of $r$.

This is quite surprising, and I was wondering if there is anything deeper going on behind the scenes. Is there something deeper? Is there any intuitive explanation of why this must be true?

  • The proof of the result nowadays seems to go through the ergodicity of the Gauss (shift) map with respect to a measure equivalent to Lebesgue measure. It is mentioned in the comments that there is an "analogous result" for the arithmetic mean of the digits in a decimal expansion, which feels far more intuitive. Perhaps an answer could rephrase the standard proof, or highlight a way to view it, that makes it "just as" intuitive. That said, the analogous result doesn't feel like real evidence for Khinchin's result: the digits of a decimal expansion are bounded, and the decimal statement basically says the digits are random, while Khinchin's result seems to show that the $a_n$ are quite tightly controlled, despite being unbounded in general. In fact, the arithmetic mean version for continued fractions diverges almost everywhere.

  • Another approach might be more geometric or topological. For example, continued fraction expansions can be seen as train track expansion sequences on the torus, with real numbers corresponding to curves of specified slopes. There is also the nice-looking paper The Modular Surface and Continued Fractions by Caroline Series, which feels promising and might be on track to give a geometric/topological reason for, and interpretation of, $K$ (though $K$ isn't mentioned explicitly there).

Best Answer

Here is what I think of as the standard explanation. I learned this from The Art of Computer Programming, Section 4.5.3.

Start with some number $z$ whose continued fraction we want to find. I'll review the continued fraction algorithm: Let $a_1 = \lfloor z \rfloor$ and let $r_1 = z-a_1$. Then let $a_2 = \lfloor r_1^{-1} \rfloor$ and let $r_2 = r_1^{-1} - a_2$. Continue in this way to define $r_k$ and $a_k$. The $a_k$ are the coefficients of the continued fraction. Our first task will be to understand the $r_k$.
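For concreteness, here is a minimal Python sketch of that algorithm (the helper name `cf_coefficients` and the truncation at $n$ terms are my own choices, and floating-point arithmetic is only reliable for the first handful of coefficients):

```python
import math

def cf_coefficients(z, n):
    """First n continued fraction coefficients a_1, ..., a_n of z, following
    the recursion above: a_1 = floor(z), r_1 = z - a_1, a_{k+1} = floor(1/r_k),
    r_{k+1} = 1/r_k - a_{k+1}.  (Illustrative helper, not from the answer.)"""
    a = math.floor(z)
    r = z - a                     # r_1
    coeffs = [a]
    for _ in range(n - 1):
        if r == 0:                # z was rational; the expansion terminates
            break
        a = math.floor(1 / r)     # a_{k+1}
        coeffs.append(a)
        r = 1 / r - a             # r_{k+1}
    return coeffs

print(cf_coefficients(math.pi, 8))   # [3, 7, 15, 1, 292, 1, 1, 1]
```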

If $z$ is chosen randomly, then $r_1$ is equally likely to be anywhere in $[0,1)$. But the same is not true of $r_2$. We have $r_2 \in [0,x]$ if and only if $r_1 \in \bigcup_{k=1}^{\infty} [(k+x)^{-1}, k^{-1}]$, so the probability that $r_2$ is in $[0,x]$ is $\sum_{k=1}^{\infty} \left( \tfrac{1}{k} - \tfrac{1}{k+x} \right)$. More generally, if $F_n(x)$ denotes the probability that $r_n \in [0,x]$, then $$F_{n+1}(x) = \sum_{k=1}^{\infty} \left( F_n(\tfrac{1}{k}) - F_n(\tfrac{1}{k+x} ) \vphantom{\sum} \right).$$
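To see the recursion in action, here is a rough numerical sketch (my own, not part of the argument): tabulate $F_n$ on a grid over $[0,1]$, starting from $F_1(x) = x$, and apply the recursion with the series truncated at $K$ terms. The grid size, the truncation $K$, and the linear interpolation are ad hoc choices for illustration only.

```python
import math

M, K = 200, 2000                     # grid resolution and series truncation (arbitrary)
xs = [i / M for i in range(M + 1)]

def interp(table, x):
    """Linear interpolation of the tabulated function at x in [0, 1]."""
    t = x * M
    i = min(int(t), M - 1)
    return table[i] + (t - i) * (table[i + 1] - table[i])

F = xs[:]                            # F_1(x) = x: r_1 is uniform on [0, 1)
for n in range(2, 6):                # compute F_2, ..., F_5
    F = [sum(interp(F, 1 / k) - interp(F, 1 / (k + x)) for k in range(1, K + 1))
         for x in xs]
    print(n, round(F[M // 2], 4))    # F_n(1/2) settles down quickly
```

The printed values settle down near $0.585$ after just a few iterations, which is what motivates guessing a limiting $F$ next.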

If we imagine that $\lim_{n\to \infty} F_n(x)$ will exist, then the limiting function $F$ should obey $$F(x) = \sum_{k=1}^{\infty} \left( F(\tfrac{1}{k}) - F(\tfrac{1}{k+x} ) \vphantom{\sum} \right).$$

One function that obeys this property is $\log (1+x)$. Let's verify this: $$ \sum_{k=1}^{N} \left( \log \frac{k+1}{k} - \log \frac{k+x+1}{k+x} \right) = \sum_{k=1}^{N} \left( \log (k+1) - \log k - \log (k+x+1) + \log (k+x) \right).$$ This telescopes to $$\log (N+1) - \log 1 - \log (N+x+1) + \log(1+x) = \log \frac{N+1}{N+x+1} + \log (1+x).$$ Sending $N \to \infty$, we get $\log (1+x)$. This argument was neutral as to the base of the logarithm, but since we want $F(1)=1$, we should take log base two.
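As a quick sanity check (my own, with an arbitrary truncation $N$ and a helper name `rhs` of my choosing), one can compare the truncated series directly against $\log_2(1+x)$; the agreement is exactly what the telescoping predicts:

```python
import math

def rhs(x, N=100_000):
    """Partial sum of sum_k (F(1/k) - F(1/(k+x))) with F(x) = log2(1+x)."""
    return sum(math.log2(1 + 1 / k) - math.log2(1 + 1 / (k + x))
               for k in range(1, N + 1))

for x in (0.25, 0.5, 0.75, 1.0):
    print(x, round(rhs(x), 3), round(math.log2(1 + x), 3))
```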

Of course, I am not going to prove that the limit exists or that it takes this value -- this is a hard theorem! But let's see why it implies the rest. The probability that $a_k = b$ is the probability that $r_{k-1} \in (\tfrac{1}{b+1}, \tfrac{1}{b}]$, or $F_{k-1}(\tfrac{1}{b}) - F_{k-1}(\tfrac{1}{b+1})$. If $F_k$ approaches $\log_2 (1+x)$, then the probability that $a_k$ is $b$ approaches $\log_2 \tfrac{b+1}{b} - \log_2 \tfrac{b+2}{b+1} = \log_2 \tfrac{(b+1)^2}{b(b+2)}$. Note that this is $\approx \tfrac{1}{b^2}$.
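Here is a quick empirical check of that limiting distribution (again my own sketch): iterate the map $r \mapsto 1/r - \lfloor 1/r \rfloor$ on many random starting points, tally the coefficients produced, and compare the frequencies with $\log_2 \tfrac{(b+1)^2}{b(b+2)}$. The sample sizes and the number of coefficients per orbit are arbitrary, and floating-point iteration is only a rough stand-in for the exact dynamics.

```python
import math, random
from collections import Counter

random.seed(1)
counts, total = Counter(), 0
for _ in range(20_000):
    r = random.random()          # plays the role of r_1 for a random z
    for _ in range(15):          # a handful of coefficients per sample
        if r == 0:               # hit a rational; stop this orbit
            break
        a = math.floor(1 / r)    # next coefficient a_k
        r = 1 / r - a            # next remainder r_k
        counts[a] += 1
        total += 1

for b in range(1, 6):
    predicted = math.log2((b + 1) ** 2 / (b * (b + 2)))
    print(b, round(counts[b] / total, 3), round(predicted, 3))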

Intuitively then, if $g$ is a function from $\mathbb{Z}_{>0}$ to $\mathbb{R}$, we would expect the average value of $g(a_k)$ to be $\sum_{b=1}^{\infty} g(b) \log_2 \tfrac{(b+1)^2}{b(b+2)}$. It wouldn't be a good idea to take $g(x) = x$, since then the sum would diverge. But, if $g(x)=\log x$, then the sum converges to the log of the Khinchin constant, as one would hope for.
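Numerically, that sum does indeed recover the familiar value $K \approx 2.685$; a minimal sketch (my own, with an arbitrary truncation point):

```python
import math

# log2 K = sum_b log2(b) * log2((b+1)^2 / (b*(b+2))); the b = 1 term is zero.
log2_K = sum(math.log2(b) * math.log2((b + 1) ** 2 / (b * (b + 2)))
             for b in range(2, 1_000_000))
print(2 ** log2_K)    # ~2.685, close to Khinchin's constant 2.68545...
```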

There is a subtle switch in the last paragraph. Before the last paragraph, $z$ was a random variable, and I looked at how the probability distribution of $a_k$ behaved as $k \to \infty$. In the last paragraph, and in Khinchin's theorem, I took a single $z$ and asked about the average behavior of $a_k$ for that $z$. This is a hard issue, but ergodic theory has standard tools to justify it.
