Extending Apéry’s Proof – Catalan’s Constant

continued-fractions irrational-numbers nt.number-theory riemann-zeta-function

I've been looking into Apéry's irrationality proof of $\zeta (3)$, and one of the first questions I had was: how did he derive the following continued fraction?
$$\begin{equation*} \zeta (3)=\dfrac{6}{5+\overset{\infty }{\underset{n=1}{\mathbb{K}}}\dfrac{-n^{6}}{34n^{3}+51n^{2}+27n+5}}\end{equation*}$$

Furthermore, is it possible to get a similar continued fraction for $\zeta(5)$, $\zeta(7)$ or $G$?

A rapidly converging central binomial series was recently found for Catalan's constant:

$$G = \frac{1}{2} \sum_{n=0}^{\infty} (-1)^n \frac{(3n+2) 8^n}{(2n+1)^3 \binom{2n}{n}^3}$$
which is in similar spirit to
$$\zeta(3)=\frac{5}{2}\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^3 \binom{2n}{n}}$$
which Apéry utilised. If I understand correctly, this gives us the first stage of an analogue of Apéry's proof for $G$. The next stage in his proof is to use a fast recursion formula that approaches $\zeta(3)$:
$$n^3 u_n + (n-1)^3 u_{n-2} = (34n^3 - 51n^2 + 27n - 5) u_{n-1},\, n \geq 2$$
A similar recursive formula that approaches $G$ was found:
$$(2n+1)^2 (2n+2)^2 p(n) u_{n+1}-q(n)u_n = (2n-1)^2 (2n)^2 p(n+1) u_{n-1}$$
where
$$p(n) = 20n^2 - 8n + 1$$
$$q(n) = 3520n^6 + 5632n^5 +2064n^4-384n^3-156n^2+16n+7$$
Now as Zudilin mentions in his paper, 'the analogy is far from proving the desired irrationality of $G$', but why exactly is this recursive formula not good enough to prove the irrationality of $G$?

A major part of Apéry's proof is to define the double sequence:
$$c_{n,k} := \sum_{m=1}^{n} \frac{1}{m^3} + \sum_{m=1}^{k} \frac{(-1)^{m-1}}{2m^3 \binom{n}{m} \binom{n+m}{m}}$$

Where does this come from, and how could one create a similar sequence instead that follows from the above binomial series for $G$?

Best Answer

Summary:

  • The continued fraction, the recurrence and the explicit form of the sequence are interchangeable, and for the Apéry numbers we don't know which came first. This extends to other constructions for other constants.

  • The approximation for Catalan's constant $G$ fails because it doesn't converge fast enough, i.e. the inclusion function grows faster than the sequence: the ratio between the two diverges (equivalently, its $n$-th root has a limit $> 1$).

This is in the spirit of Fischler's expository article, Zudilin's paper and the Zeilberger-Zudilin paper on automatic discovery of irrationality proofs. I will try to be as rigorous as my knowledge allows.

First, we will slightly modify the irrationality criterion. How does the criterion that appears in van der Poorten's paper imply the one that appears in these papers? Let's recall it. Let $\beta$ be a real number. If there are a $\delta>0$ and a sequence of rationals $\frac{p_{n}}{q_{n}}$ with $\frac{p_{n}}{q_{n}} \neq \beta$, $q_{n} \rightarrow \infty$ and

$$\left|\beta-\frac{p_{n}}{q_{n}}\right| < \frac{1}{q_{n}^{1+\delta}}$$

then $\beta$ is irrational.

Multiplying by $q_{n}$ gives

$$\left|q_{n}\beta-p_{n}\right| < \frac{1}{q_{n}^\delta}$$

for some $\delta$ with $0<\delta<1$. If we apply the usual procedure (raise to the power $\frac{1}{n}$ and then take the limit), then for this kind of sequence $q_{n}^{\frac{1}{n}}$ tends to a constant $\mu>1$, so the right-hand side simply yields that constant raised to the power $-\delta$:

$$\lim_{n \rightarrow \infty} \frac{1}{q_{n}^{\frac{1}{n}\delta}}=\frac{1}{\mu^{\delta}}<1$$

The bound is strictly below 1 however small $\delta$ is, and we only need some $\delta$ to exist: it can be very close to 0, as long as it is not 0.

In summary, the criterion translates into

$$\lim_{n \rightarrow \infty}\left|q_{n}\beta-p_{n}\right|^{\frac{1}{n}}=L<1$$
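As a short worked step (a sketch, assuming in addition that $q_{n}^{\frac{1}{n}} \rightarrow \mu > 1$, with $\mu$ the constant introduced above), one can see why $L<1$ is exactly what is needed. For large $n$ the two sides of the multiplied inequality behave like

$$\left|q_{n}\beta-p_{n}\right| \approx L^{n} \qquad \frac{1}{q_{n}^{\delta}} \approx \mu^{-\delta n}$$

so the inequality holds for all large $n$ as soon as $L<\mu^{-\delta}$, that is,

$$0<\delta<\frac{\log (1/L)}{\log \mu}$$

Such a $\delta$ exists precisely when $L<1$.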

Second, it is worth mentioning that the explicit form of a sequence, its recurrence relation, its continued fraction, its integral representations, etc. are all complementary descriptions of the sequence, and each of them helps us to prove the ingredients of an Apéry-like proof, namely:

  1. The existence of a linear form in two sequences satisfying the previous irrationality criterion.
  2. The existence of an "inclusion" for the sequences, i.e. a function such that multiplying the sequences by it gives integer sequences.
  3. The linear form $q_{n}\beta-p_{n}$ is nonzero for infinitely many $n$.

Point 1) requires asymptotic estimates of the sequences, and for this purpose the recurrence description is useful. We also need to estimate the error between the number and our approximation; again you can use the recurrence or an integral representation (as in Beukers's proof). Point 2) is generally proved using the explicit form of the sequences. And for Point 3) you have many options. As you can see, the approach is flexible and depends on the nature of the sequence.

The technical considerations are: a) The asymptotics of a sequence given by a holonomic recurrence can be obtained using the algorithm in the Wimp and Zeilberger paper or the method of Flajolet and Sedgewick presented in the book "Analytic Combinatorics". (Some caution is needed here, and indeed I would like to consider this in another post. The Wimp-Zeilberger algorithm is an exposition of the Birkhoff-Trjitzinsky method, which hasn't been completely verified. Also, since the calculations are complicated, we need a software package to find the terms, like Kauers's Mathematica package. This doesn't mean that previously given asymptotics are wrong, but further details need to be checked.) These are the reasons why we will avoid calculating full asymptotics (even where we can) and instead deal with expressions of the kind $\lim_{n \rightarrow \infty}a_{n}^{\frac{1}{n}}$, which can be tackled using the Poincaré-Perron theorem plus an assumption of minimality of the solutions plus this paper of Perron, also known as Perron's second theorem. Modern approaches to the proof use this setup to avoid further technicalities. For a summary of the full asymptotics of the Apéry numbers see this, page 379.

b) Point 2) is a bit trickier: it requires calculating the $p$-adic valuation of sub-terms in the explicit form, which generally are combinations of powers and binomial coefficients. Here we also need the inclusion function to be small in comparison with the sequence's asymptotics, in order to reach point 1).

Let's try some examples with $\zeta (3)$ and Catalan's constant to visualize this.

For $\zeta(3)$, the recurrence is given by

$$n^3 u_n -(34n^3 - 51n^2 + 27n - 5) u_{n-1}+ (n-1)^3 u_{n-2} = 0\quad n \geq 2$$

To use the Poincaré-Perron (PP) theorem, we need to transform it into a form $u_{n}+a(n)u_{n-1}+b(n)u_{n-2}=0$ with

$$a=\lim_{n \rightarrow \infty} a(n) \quad b=\lim_{n \rightarrow \infty} b(n)$$

Dividing by $n^3$, we obtain

$$u_n -(34 - 51n^{-1} + 27n^{-2} - 5n^{-3}) u_{n-1}+ (1 -3n^{-1}+3n^{-2}-n^{-3}) u_{n-2} = 0\quad n \geq 2$$

Then we calculate the roots of the characteristic polynomial $x^2+ax+b=0$, i.e. $x^2-34x+1=0$, and the conclusion of the theorem is that if $x_{n}$ and $y_{n}$ are two suitably chosen linearly independent solutions of the recurrence, then

$$\lim_{n \rightarrow \infty}\frac{x_{n+1}}{x_{n}}=\lambda_{1} \quad \lim_{n \rightarrow \infty}\frac{y_{n+1}}{y_{n}}=\lambda_{2}$$

where $\lambda_{1}$ and $\lambda_{2}$ are the roots and $|\lambda_{1}|>|\lambda_{2}|$. This alone doesn't give asymptotic estimates of the general solutions, so we will use some improvements of the theorem. If $a(n)\sim n^{C_{1}}$, $b(n)\sim n^{C_{2}}$ and $C_{1}=C_{2}$, then the solution given by $y_{n}$ is minimal (page 370), in the sense that for any other independent solution of the recurrence, call it $h_{n}$ (i.e. $h_{n}$ is not a multiple of $y_{n}$), we have:

$$\lim_{n \rightarrow \infty}\frac{y_{n}}{h_{n}}=0$$

In this case, writing $a_{n}$ and $b_{n}$ for the numerators and denominators of the approximations $\frac{a_{n}}{b_{n}} \rightarrow \zeta(3)$ (both solve the recurrence and neither is minimal), we can use the following estimates,

$$\lim_{n \rightarrow \infty} a_{n}^{\frac{1}{n}} = \lim_{n \rightarrow \infty} b_{n}^{\frac{1}{n}}=17+12\sqrt{2}=\alpha$$
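These growth rates can be checked numerically. The following is a minimal sketch that iterates the recurrence with exact rationals; the initial data $a_0=0$, $a_1=6$, $b_0=1$, $b_1=5$ are an assumption (the usual normalisation, not stated above), and the function name `apery` is just illustrative.

```python
from fractions import Fraction
from math import sqrt

def apery(n_max):
    """Solve n^3 u_n = (34n^3 - 51n^2 + 27n - 5) u_{n-1} - (n-1)^3 u_{n-2}."""
    # Assumed initial data (not stated in the text): a_0 = 0, a_1 = 6, b_0 = 1, b_1 = 5.
    a = [Fraction(0), Fraction(6)]
    b = [Fraction(1), Fraction(5)]
    for n in range(2, n_max + 1):
        c = 34 * n**3 - 51 * n**2 + 27 * n - 5
        a.append((c * a[-1] - (n - 1) ** 3 * a[-2]) / n**3)
        b.append((c * b[-1] - (n - 1) ** 3 * b[-2]) / n**3)
    return a, b

# Roots of the characteristic polynomial x^2 - 34x + 1.
alpha = 17 + 12 * sqrt(2)
small_root = 17 - 12 * sqrt(2)

a, b = apery(60)
ratio = float(b[60] / b[59])    # slowly approaches alpha (polynomial corrections)
approx = float(a[60] / b[60])   # approximation to zeta(3)
print(alpha, small_root, ratio, approx)
```

The ratio $b_{n+1}/b_n$ approaches $\alpha$ only slowly, because of polynomial correction factors, which is one reason the argument works with $n$-th roots rather than full asymptotics.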

In the case of $G$, the recurrence is

$$(2n+1)^2 (2n+2)^2 p(n) u_{n+1}- q(n)u_n - (2n-1)^2 (2n)^2 p(n+1) u_{n-1}=0$$

The situation is similar: dividing by $(2n+1)^{2}(2n+2)^2 p(n)$ and then taking the limits of $a(n)$ and $b(n)$, we arrive at the characteristic polynomial $x^2-11x-1=0$, and therefore we obtain

$$\lim_{n \rightarrow \infty} a_{n}^{\frac{1}{n}} = \lim_{n \rightarrow \infty} b_{n}^{\frac{1}{n}}=\frac{11}{2} + \frac{5 \sqrt{5}}{2}=\delta$$

(this $\delta$ is a growth rate, not the exponent $\delta$ of the criterion). From the recurrence, you can also estimate the error between the number and our approximation. In the case of $\zeta(3)$, using the recurrence, we have:

$$a_{n}b_{n-1}-a_{n-1}b_{n}=\frac{6}{n^{3}}$$

$$\left|\zeta (3)-\frac{a_{n}}{b_{n}}\right| = \sum_{k=n+1}^{\infty}\frac{6}{k^{3}b_{k}b_{k-1}}$$
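The determinant identity for $\zeta(3)$ and its telescoped form can be verified exactly for small $n$. This is a sketch under the same assumed initial data as usual ($a_0=0$, $a_1=6$, $b_0=1$, $b_1=5$, not stated in the text); each term of the telescoped sum carries its own index $k$.

```python
from fractions import Fraction

# Assumed initial data (not stated in the text): a_0 = 0, a_1 = 6, b_0 = 1, b_1 = 5.
a = [Fraction(0), Fraction(6)]
b = [Fraction(1), Fraction(5)]
for n in range(2, 31):
    c = 34 * n**3 - 51 * n**2 + 27 * n - 5
    a.append((c * a[-1] - (n - 1) ** 3 * a[-2]) / n**3)
    b.append((c * b[-1] - (n - 1) ** 3 * b[-2]) / n**3)

# a_n b_{n-1} - a_{n-1} b_n = 6 / n^3, checked exactly for every computed n.
casoratian_ok = all(
    a[n] * b[n - 1] - a[n - 1] * b[n] == Fraction(6, n**3) for n in range(1, 31)
)

# Dividing by b_k b_{k-1} and summing from k = n0 + 1 to N telescopes exactly.
n0, N = 10, 30
telescoped = sum(Fraction(6, k**3) / (b[k] * b[k - 1]) for k in range(n0 + 1, N + 1))
telescoping_ok = telescoped == a[N] / b[N] - a[n0] / b[n0]
print(casoratian_ok, telescoping_ok)
```

Letting $N \rightarrow \infty$ in the second check gives the error series for $\zeta(3)-\frac{a_n}{b_n}$.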

For $G$ we will find a similar expression, since:

$$(2n+1)^2 (2n+2)^2 p(n)(a_{n+1}b_{n}-a_{n}b_{n+1})=-(2n-1)^2 (2n)^2 p(n+1)(a_{n}b_{n-1}-a_{n-1}b_{n})$$

Applying this relation repeatedly, the factors telescope down to $n=0$ (note that $p(1)=13$):

$$\therefore a_{n+1}b_{n}-a_{n}b_{n+1}=(-1)^{n}\frac{4\,p(n+1)}{13\,(2n+1)^2 (2n+2)^2}\,(a_{1}b_{0}-a_{0}b_{1})$$
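A numerical sanity check of the recurrence for $G$ may help here. This sketch assumes the initial data $b_0=1$, $b_1=\frac{7}{4}$, $a_0=0$, $a_1=\frac{13}{8}$ (recalled from Zudilin's paper, not stated above, so treat them as an assumption), checks the one-step determinant relation exactly, and compares $\frac{a_n}{b_n}$ with the binomial series for $G$ quoted in the question. The helper names `lead`, `trail` and `w` are illustrative.

```python
from fractions import Fraction
from math import comb

def p(n):
    return 20 * n**2 - 8 * n + 1

def q(n):
    return (3520 * n**6 + 5632 * n**5 + 2064 * n**4
            - 384 * n**3 - 156 * n**2 + 16 * n + 7)

def lead(n):
    """Coefficient of u_{n+1} in the recurrence."""
    return (2 * n + 1) ** 2 * (2 * n + 2) ** 2 * p(n)

def trail(n):
    """Coefficient of u_{n-1} in the recurrence."""
    return (2 * n - 1) ** 2 * (2 * n) ** 2 * p(n + 1)

# Assumed initial data (not stated in the text).
a = [Fraction(0), Fraction(13, 8)]   # numerators
b = [Fraction(1), Fraction(7, 4)]    # denominators
for n in range(1, 12):
    a.append((q(n) * a[n] + trail(n) * a[n - 1]) / lead(n))
    b.append((q(n) * b[n] + trail(n) * b[n - 1]) / lead(n))

def w(n):
    """Determinant a_{n+1} b_n - a_n b_{n+1}."""
    return a[n + 1] * b[n] - a[n] * b[n + 1]

one_step_ok = all(lead(n) * w(n) == -trail(n) * w(n - 1) for n in range(1, 11))

# Reference value of G from the binomial series in the question (60 terms).
G = Fraction(1, 2) * sum(
    Fraction((-8) ** n * (3 * n + 2), (2 * n + 1) ** 3 * comb(2 * n, n) ** 3)
    for n in range(60)
)
error = abs(float(a[11] / b[11] - G))
print(one_step_ok, float(G), error)
```

So the approximations do converge to $G$, and quickly; the obstruction discussed in this answer lies in the size of the denominators, not in the convergence itself.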

and a similar bound holds. Now you have to prove the inclusions. For $\zeta (3)$ the inclusion is given by $2D_{n}^{3}a_{n} \in \mathbb{Z}$, and for $G$ Zudilin proved the inclusions

$$2^{4n+3} D_{n} b_{n} \in \mathbb{Z} \quad 2^{4n+3}D_{2n-1}^3 a_{n} \in \mathbb{Z}$$

where $D_{n}=\textrm{lcm}(1,2,\ldots,n)=e^{\psi(n)}$ ($\psi(n)$ is the Chebyshev $\psi$ function). So, for $G$, if we multiply $a_{n}$ and $b_{n}$ by $2^{4n+3}D_{2n-1}^3$, they become integers. Note: trivially we have that $D_{n}|D_{2n-1}^3$.

We define the inclusion of $a_{n}$, written $\textrm{Incl}_{a,n}$ (and by analogy $\textrm{Incl}_{b,n}$ for $b_{n}$), to be the function by which we need to multiply $a_{n}$ (or $b_{n}$) to make it an integer. I won't give too much detail on this step, but the proof of both inclusions relies on the technical argument explained before (the explicit form sometimes yields this easily by means of hypergeometric manipulations).
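The inclusions can at least be tested on the first few terms. The sketch below is not a proof; it only checks integrality for small $n$, using exact rationals and assumed initial data for both recurrences ($a_0=0$, $a_1=6$, $b_0=1$, $b_1=5$ for $\zeta(3)$; $a_0=0$, $a_1=\frac{13}{8}$, $b_0=1$, $b_1=\frac{7}{4}$ for $G$), none of which are stated above. It needs Python 3.9+ for `math.lcm`.

```python
from fractions import Fraction
from math import lcm

def D(n):
    """lcm(1, 2, ..., n); math.lcm() with no arguments returns 1."""
    return lcm(*range(1, n + 1))

def apery(n_max):
    # Assumed initial data for zeta(3): a_0 = 0, a_1 = 6, b_0 = 1, b_1 = 5.
    a, b = [Fraction(0), Fraction(6)], [Fraction(1), Fraction(5)]
    for n in range(2, n_max + 1):
        c = 34 * n**3 - 51 * n**2 + 27 * n - 5
        a.append((c * a[-1] - (n - 1) ** 3 * a[-2]) / n**3)
        b.append((c * b[-1] - (n - 1) ** 3 * b[-2]) / n**3)
    return a, b

def catalan(n_max):
    # Assumed initial data for G: a_0 = 0, a_1 = 13/8, b_0 = 1, b_1 = 7/4.
    def p(n):
        return 20 * n**2 - 8 * n + 1

    def q(n):
        return (3520 * n**6 + 5632 * n**5 + 2064 * n**4
                - 384 * n**3 - 156 * n**2 + 16 * n + 7)

    a, b = [Fraction(0), Fraction(13, 8)], [Fraction(1), Fraction(7, 4)]
    for n in range(1, n_max):
        lead = (2 * n + 1) ** 2 * (2 * n + 2) ** 2 * p(n)
        trail = (2 * n - 1) ** 2 * (2 * n) ** 2 * p(n + 1)
        a.append((q(n) * a[n] + trail * a[n - 1]) / lead)
        b.append((q(n) * b[n] + trail * b[n - 1]) / lead)
    return a, b

def is_integer(x):
    return x.denominator == 1

za, zb = apery(15)
zeta3_ok = all(
    is_integer(zb[n]) and is_integer(2 * D(n) ** 3 * za[n]) for n in range(16)
)

ga, gb = catalan(12)
catalan_ok = all(
    is_integer(2 ** (4 * n + 3) * D(n) * gb[n])
    and is_integer(2 ** (4 * n + 3) * D(2 * n - 1) ** 3 * ga[n])
    for n in range(1, 13)
)
print(zeta3_ok, catalan_ok)
```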

Finally, we have that

$$\left|q_{n}\beta-p_{n}\right|<\frac{Cq_{n}}{b_{n}^{2}}$$

Since for these sequences, we have $q_{n}=b_{n}\textrm{Incl}_{a,n}$

$$\left|q_{n}\beta-p_{n}\right|<\frac{C\textrm{Incl}_{a,n}}{b_{n}}$$

Let's define $\phi$ $$\phi=\lim_{n \rightarrow \infty}\textrm{Incl}_{a,n}^{\frac{1}{n}}$$

For $\zeta(3)$ $$\phi=\lim_{n \rightarrow \infty}e^{3\frac{\psi(n)}{n}}=e^{3}$$

by the prime number theorem $\lim_{n \rightarrow \infty}\frac{\psi(n)}{n}=1$
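The limit $\frac{\psi(n)}{n} \rightarrow 1$ is easy to observe numerically through $D_n=\textrm{lcm}(1,\ldots,n)$. A minimal sketch (the function name `chebyshev_psi` is illustrative; `math.lcm` needs Python 3.9+):

```python
from math import lcm, log

def chebyshev_psi(n):
    """psi(n) = log lcm(1, ..., n)."""
    return log(lcm(*range(1, n + 1)))

for n in (10, 100, 1000, 10000):
    print(n, chebyshev_psi(n) / n)
```

The convergence is slow and not monotone, but the ratio is already within a couple of percent of 1 for $n=10^4$.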

Thus, if you raise the approximation to the power $\frac{1}{n}$ and then take the limit as $n \rightarrow \infty$, the behaviour of the right-hand side is governed by $\frac{\phi}{\alpha}$ (with $\delta$ in the denominator in the case of $G$). If this ratio is bigger than 1, the bound is not useful. If it is less than 1, we can prove irrationality.
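For $\zeta(3)$ this can be watched directly. The sketch below computes $\left|q_{n}\zeta(3)-p_{n}\right|^{\frac{1}{n}}$ with $q_n=2D_n^3b_n$ and $p_n=2D_n^3a_n$. Two assumptions are made: the usual initial data $a_0=0$, $a_1=6$, $b_0=1$, $b_1=5$, and the use of a much deeper approximation $\frac{a_N}{b_N}$ (here $N=80$) as a stand-in for $\zeta(3)$, whose own error is negligible at the sizes measured.

```python
from fractions import Fraction
from math import lcm

N = 80  # depth of the reference approximation standing in for zeta(3)

# Assumed initial data (not stated in the text): a_0 = 0, a_1 = 6, b_0 = 1, b_1 = 5.
a, b = [Fraction(0), Fraction(6)], [Fraction(1), Fraction(5)]
for n in range(2, N + 1):
    c = 34 * n**3 - 51 * n**2 + 27 * n - 5
    a.append((c * a[-1] - (n - 1) ** 3 * a[-2]) / n**3)
    b.append((c * b[-1] - (n - 1) ** 3 * b[-2]) / n**3)

zeta3_proxy = a[N] / b[N]

def scaled_root(n):
    """|q_n * zeta(3) - p_n|^(1/n) with q_n = 2 D_n^3 b_n and p_n = 2 D_n^3 a_n."""
    d = lcm(*range(1, n + 1))
    linear_form = 2 * d**3 * (b[n] * zeta3_proxy - a[n])
    return float(abs(linear_form)) ** (1.0 / n)

values = {n: scaled_root(n) for n in (10, 20, 30, 40)}
print(values)
```

The values stay below 1, which is what the criterion needs; they approach the limiting ratio only slowly because of polynomial factors and the fluctuations of $\psi(n)$.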

For $\zeta(3)$, $\frac{e^3}{\alpha} \approx 0.59$ and for $G$, $\frac{e^6 2^4}{\delta}\approx 582$, where $\delta=\frac{11}{2}+\frac{5\sqrt{5}}{2}$ is the growth rate found above. The huge difference between the ratios is what Zudilin indicated. Since for $G$ the value is bigger than 1, the sequence is not useful to prove irrationality. Even if we use the conjectured inclusions in the concluding remarks, we obtain $\frac{e^4 2^4}{\delta}\approx 79$. This illustrates very well the interplay between the approximation, the asymptotics of the inclusion function and the asymptotics of our sequence.
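The three numbers quoted here follow from a few lines of arithmetic. A minimal sketch (variable names are illustrative):

```python
from math import exp, sqrt

alpha = 17 + 12 * sqrt(2)           # larger root of x^2 - 34x + 1
growth_G = (11 + 5 * sqrt(5)) / 2   # larger root of x^2 - 11x - 1

ratio_zeta3 = exp(3) / alpha                    # inclusion 2 D_n^3
ratio_G = 2**4 * exp(6) / growth_G              # inclusion 2^(4n+3) D_{2n-1}^3
ratio_G_conjectured = 2**4 * exp(4) / growth_G  # e^4 in place of e^6

print(round(ratio_zeta3, 2), round(ratio_G), round(ratio_G_conjectured))
# prints: 0.59 582 79
```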

In the literature, you can find many approaches to constructing these sequences: first you can work with an explicit form of the sequences (many of them are obtained from hypergeometric identities), and then you guess a recurrence for them (using the Gosper-Zeilberger algorithm; see this). If you have the recurrence, the continued fraction description is immediate. Or vice versa: you try to find recurrences that give rise to integer or "quasi-integer" sequences by searching the space of coefficients in the recurrence, and then prove a closed form for the sequences using combinatorial techniques. Therefore, we still don't have a general way to come up with sequences like the Apéry numbers, i.e. we're looking for "well-poised" sequences in the sense that they almost satisfy the three ingredients, but the seed or the intuition to know where to look comes from the researcher. So the answers to your questions are: Where does the continued fraction come from? From the recurrence. Where does the recurrence come from? From the explicit form with the $c_{n,k}$ of van der Poorten's paper. Where does the explicit form with the $c_{n,k}$ come from? From combinatorial identities and a bit of luck. You can start wherever you want; for example see this, especially Remarque 2.
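The step from the recurrence to the continued fraction can be illustrated concretely. The sketch below evaluates the continued fraction from the question, truncated at a given depth, and compares it exactly with the ratios $\frac{a_n}{b_n}$ produced by the recurrence. The initial data $a_0=0$, $a_1=6$, $b_0=1$, $b_1=5$ are an assumption (not stated above), and `cf_truncated` is an illustrative name.

```python
from fractions import Fraction

def cf_truncated(depth):
    """Evaluate 6 / (5 + K_{n=1}^{depth} (-n^6) / (34n^3 + 51n^2 + 27n + 5)) bottom-up."""
    tail = Fraction(0)
    for n in range(depth, 0, -1):
        tail = Fraction(-(n**6)) / (34 * n**3 + 51 * n**2 + 27 * n + 5 + tail)
    return 6 / (5 + tail)

# Recurrence side, with assumed initial data a_0 = 0, a_1 = 6, b_0 = 1, b_1 = 5.
a, b = [Fraction(0), Fraction(6)], [Fraction(1), Fraction(5)]
for n in range(2, 17):
    c = 34 * n**3 - 51 * n**2 + 27 * n - 5
    a.append((c * a[-1] - (n - 1) ** 3 * a[-2]) / n**3)
    b.append((c * b[-1] - (n - 1) ** 3 * b[-2]) / n**3)

# Truncating the continued fraction at depth k reproduces a_{k+1} / b_{k+1}.
matches = all(cf_truncated(k) == a[k + 1] / b[k + 1] for k in range(15))
print(matches, float(cf_truncated(14)))
```

The match works because $n!^3 b_n$ satisfies the three-term recurrence of the continued fraction's denominators, with partial numerators $-n^6$ and partial denominators $34n^3+51n^2+27n+5$.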

The interesting question is, of course: what is common to the sequences that allow us to prove the irrationality of $\zeta(2)$ and $\zeta(3)$? This question is deep and goes back to Zagier's paper: the generating functions of the sequences used in the proofs admit a modular parametrization, a fact that was proved rigorously by Beukers and is also related to the differential equation satisfied by the generating function (Picard-Fuchs equations and variants). Some sequences that are "well-poised" in the sense that they almost satisfy the three ingredients don't possess this modular behaviour. In contrast, some families of sequences that arise in the massive search for recurrences do exhibit this phenomenon. Surprisingly, the latter give sequences that satisfy point 2) almost immediately with small inclusions, which is not trivial, and there is still work in progress to understand why. Some of them also satisfy condition 1) qualitatively, depending on which constant they are approximating, of course. But they fail in the final calculation of the irrationality criterion. For more details, see this and this post.

I'm on the combinatorial side, so an expert in the work of Zagier and Beukers can complement this part better.

Comment: In many parts, I'm using the positivity, rationality and monotonicity of the sequences to simplify the argument. Point 3) is in general easy to prove.