Limit associated with a recursion, connection to normality of quadratic irrationals

irrational-numberslimitsnumber theoryrecurrence-relationssequences-and-series

Update on 3/2/2020. All the material below and much more has been incorporated into a comprehensive article on this topic. The question below is discussed in that article, entitled "State-of-the-Art Statistical Science to Tackle Famous Number Theory Conjectures", and available here.

I posted a popular question 5 months ago about the following recursion, see here.

If $z_n < 2y_n$ Then

  • $y_{n+1} = 4y_n – 2z_n$
  • $z_{n+1} = 2z_n + 3$
  • $d_{n+1}=1$

Else

  • $y_{n+1} = 4y_n$
  • $z_{n+1} = 2 z_n – 1$
  • $d_{n+1}=0$

Back then, I wrote:

The sequence $d_n$ represents the binary digits of some unknown number
$x$, a number that depends on the initial conditions. It
turns out that if $y_1=2,z_1=5$ then that number is $x=\sqrt{2}$.

Here I offer a full solution and a potential path to proving the normality of quadratic numbers. My question is about proving that my main result (below) is correct. It is backed by very strong empirical results involving computations with thousands of digits. By normality, I mean that 50% of the binary digits of $x$ are equal to 1. This is one of the most challenging unsolved mathematical conjectures of all times.

Below is a Perl script that does all the computations. It uses the Bignum library to perform exact arithmetic (computation of millions of binary digits for each number, using the formulas described here.) The variable called number in the code corresponds to $x$.

use strict;
use bignum;

my $y;
my $z;
my $u;
my $v;
my $k;
my $c;
my $even;

my $counter;
my $seed_y;
my $seed_z;
my $number;
my $denominator;
my $c1;
my $c2;

$counter=0;

open(OUT2,">collatzr.txt"); # summary stats
open(OUT,">coll2.txt");     # details and digits for each number


for ($seed_y=1; $seed_y<=5; $seed_y++) {
for ($seed_z=$seed_y; $seed_z<=10; $seed_z++) {


  $y=$seed_y;
  $z=$seed_z;
  $u=2*$y-$z; 
  $v=2*$z+3;

  $number=0;
  $denominator=1;
  $c1=0;
  $c2=0;

  for ($k=1; $k<200; $k++) { # compute 200 digits

    if ($u>0) { 
      $even=1;  # digit equal to 1
      $c1++;
      $y=4*$y-2*$z;
      $z=2*$z+3;
      $u=4*$u-$v;
      $v=2*$v+3;
    } else {
      $even=0;  # digit equal to 0
      $c2++;  
      $y=4*$y; 
      $z=2*$z-1;
      $u=4*$u+$v-2;
      $v=2*$v-5;  
    }            
    print OUT "$seed_y\t$seed_z\t$k\t$even\n"; 
    $denominator=$denominator/2;  
    $number=$number+$even*$denominator;
    $c=$z*$denominator;
  }

  $counter++;
  if ($counter%5 == 0) { print "$seed_y\t$seed_z\n"; select()->flush(); }
  print OUT2 "$seed_y\t$seed_z\t$c1\t$c2\t$c\t$number\n";


}
}

close(OUT);
close(OUT2);

1. Main result

Let
$$x = \sum_{k=0}^\infty \frac{d_k}{2^k}, \mbox{ with } d_0=0 \tag 1$$

Then, assuming $y_0, z_0$ are positive integers, we have:

  • $y_0=0 \Rightarrow x=0$
  • $z_0 = 2y_0 \Rightarrow x=\frac{1}{2}$
  • $z_0 < y_0 \Rightarrow x=1$

In all other cases (referred to as the standard case), $x$ is an irrational quadratic number solution of $2x^2 +(z_0-1)x -y_0=0$, more specifically:

$$x =\frac{-(z_0-1)+\sqrt{(z_0-1)^2+8y_0}}{4} \tag 2$$

My question

Can you prove the above result? It was obtained empirically.

2. Useful tips to answer my question

In the standard case, we have the following result (not proved yet):
$$\lim_{n\rightarrow\infty}\frac{z_n}{2^n}=\sqrt{(z_0-1)^2 + 8y_0}.$$

Also, using $u_n=2y_n-z_n$ and $v_n = 2z_n+3$, the recurrence can be rewritten as:

If $u_n>0$ Then

  • $u_{n+1}=4u_n -v_n$
  • $v_{n+1} = 2v_n + 3$
  • $d_{n+1}=1$

Else

  • $u_{n+1}=4u_n + v_n-2$
  • $v_{n+1} = 2v_n-5$
  • $d_{n+1}=0$

Finally, $\mbox{mod}(v_n, 8) = 5$, that is, $(v_n – 5)/8$ is an integer. If $n>1$ we have:

$$d_n = \mbox{ mod}\Big(\frac{v_n-5}{8},2\Big).$$
This leads to the following simple reverse recurrence involving only one variable, allowing you to compute the digits of $x$ backward, starting with a large $n$ and moving backward down to $n=1$:

$$\mbox{If mod}\Big(\frac{v_{n}-5}{8}, 2\Big) = 1, \mbox{ then } v_{n-1}=\frac{v_{n}-3}{2}, d_{n}=1, \mbox{ else } v_{n-1}=\frac{v_{n}+5}{2}, d_{n} = 0.$$

The very tough problem, outlined in the next section, is to prove that each of these two outcomes is equally likely to occur, on average. This would indeed be true if each $v_n$ is arbitrary, but this is not the case here. Also, if for some large $n$, we have $d_n=1$, then a run of $R$ successive digits $d_{n-1},\dots,d_{n-R}$ all equal to zero can only go so far, unless $v_n$ is a very special number not leading to $x$ being irrational. Maybe $R=\lfloor 2\sqrt{n}\rfloor$ is an upper bound. This is something worth exploring.

Property of the reverse recurrence: If $\mbox{mod}(v_n,8)=5$ and $v_n
> 5$
, then the sequence $v_n, v_{n-1},\dots$ is strictly decreasing until it reaches $5$ and stays there permanently; also each term is congruent to $5$ modulo $8$. This is true whether or not $v_n$ was generated using our forward recurrence.

An interesting application of this property is as follows. Take an arbitrary number, say $x = \log 2$. Multiply by a large power of $2$, say $2^{30}$. Round the result to the closest integer congruent to $5$ modulo $8$, and let this be your $v_n$. In this case, $v_n =\lfloor 2^{30} \log 2 \rfloor$. Compute $v_{n-1}, v_{n-2}$ and so on, as well as the associated digits, using the reverse recurrence. Stop when you hit $5$. The digits in question are the first binary digits of $\log 2$ yielding the approximation $0.693147175\dots$ while the exact value is $0.693147180\dots$

A similar reverse recurrence is also available for the original system:
If $\mbox{mod}(\frac{z_{n}-1}{4}, 2) = 1$, then $z_{n-1}=\frac{z_{n}-3}{2}$, $d_{n}=1$, else $z_{n-1}=\frac{z_{n}+1}{2}$, $d_{n} = 0$. We also have $\mbox{mod}(z_n,4)=1$.

3. Connection to normality of irrational quadratic numbers

This is not part of my question, just interesting, extra material for the curious reader, and to provide some background as to why I am interested in this recursion. Do not even try to solve my problem below: contrarily to the main result, this stuff is incredibly hard; it could keep you busy and depressed for many years.

Let $S_n$ denotes the number of binary digits $d_k$ of $x$, that are equal to 1, for $k=1,\cdots, n$. If irrational quadratic numbers were indeed normal as we all believe they are, then, there is an absolute constant $K$ (not depending on $x$), and for each $x$, there is a number $N(x)$ denoted as $N$, such that

$$\mbox{If } n > N, \mbox{ then } S_n – K\sqrt{n} \leq \frac{n}{2} \leq S_n + K\sqrt{n} \tag 3$$

This is a consequence of the Berry-Hessen theorem applied to Bernouilli variables. It is discussed in sections 1.1 and 1.2 in this article. The chart below shows $\frac{|2S_n – n|}{\sqrt{n}}$ on the Y-axis, with $n$ between 0 and 530,000 on the X-axis, for the case $y_0 = z_0 = 1$ leading to $x=\frac{\sqrt{2}}{2}$. It suggests (not a proof) that in this case, $N = 0$ and $K = 0.90$ might work.

enter image description here

To prove that $x$ has 50% of its binary digits equal to 1, a potential approach thus consists in proving that if the previous inequality is true for $n$ large enough, then it is also true for $n+1$, by looking at the worst case scenario for the potential distribution of the first $n$ binary digits of $x$, using the recurrence relation introduced at the beginning, or the backward recurrence.

Some of the numbers $x$ that I tested are approaching the 50% ratio in question rather slowly, for instance if $y_0=1, z_0=16$. Indeed, I am wondering if some of these quadratic irrationals, maybe a finite number of them, even though normal, do not satistfy $(3)$, but instead a weaker result, say with $\sqrt{n}$ replaced by $n^{3/4}$ or $\frac{n}{\log n}$. To the contrary, a very fast convergence, say $n^{1/4}$ or $\log n$ instead of $\sqrt{n}$ in $(3)$, would also mean, even though $x$ may be normal, that its digits are not distributed like i.i.d. Bernouilli$(\frac{1}{2})$ variables. The only way for this Bernouilli behavior to happen, is if the convergence to the 50% ratio occurs at the right speed, that is with $\sqrt{n}$ in inequality $(3)$. In other words, for a specific $x$, any asymptotic departure from $\sqrt{n}$ in $(3)$ would mean that its binary digits are not distributed in a purely random way. This "pure randomness" criterion is stronger than having 50% of the digits equal to 1. For instance, $x=\frac{2}{3}=0.10101010\dots$ (in base 2) has 50% of its digits equal to 1, but the term $O(\sqrt{n})$ in $(3)$ can be replaced by the optimum bound $O(1)$, and the digits look anything but random.

I am doing some simulations and testing at this moment, see for instance my recent question on CrossValidated, here. Another spectacular result that might be more easier to prove is that the correlation between the binary digits of $px$ and $qx$ is equal to $\frac{1}{pq}$ if $p, q$ are odd, co-prime, non-zero integers: see here. A corollary is that if $\alpha,\beta$ are irrationals linearly independent over the set of rational numbers, then their binary digits have zero cross-correlation.

Best Answer

The proof follows rather easily from all the data you gathered. As often with recurrences, the core idea is to realize that if the conjecture holds for $y_0,z_0$ it holds for $y_n,z_n$ as well, and deduce new, non-trivial consequences from that.

Generalizing your formula for $x$, let us put

$$ x_n =\frac{-(z_n-1)+\sqrt{(z_n-1)^2+8y_n}}{4} \tag{1} $$

As you already computed, $x_n$ is a root of $P_n=x^2 +(z_n-1)x -y_n$. If your conjecture is correct (and it is, as will be shown soon), $x_n$ should be in $[0,1]$. It turns out that this is true because of the form of $P_n$.

Lemma 1. For every $n$, $P_n$ is increasing on $[0,1]$, and satisfies $P_n(0) \lt 0 \lt P_n(1)$.

Proof of lemma 1 : Since $P'_n(0)=z_n-1$, $P_n(0)=1-y_n$ and $P_n(1)=z_n+1-y_n$, it suffices to show that $z_n\geq 1,1-y_n\lt 0 \lt z_n+1-y_n$ for all $n$. This is straightforward by induction on $n$.

Let $\delta_n$ be the second digit in the dyadic expansion of $x_n$ (it will soon turn out that $\delta_n$ is the same thing as your $d_{n+1}$). We wish to know if $\delta_n$ is zero or $1$, in other words whether $x_n$ is smaller or larger than $\frac{1}{2}$, or what is the sign of $P_n(\frac{1}{2})$.

But

$$ P_n(\frac{1}{2})=\frac{z_n-2y_n}{2} \tag{2} $$

Now you know where your comparison of $z_n$ to $2y_n$ comes from ! (2) also shows that $\delta_n=d_{n+1}$. Furthermore, a purely algebraic verification shows that the recursion on $y_n$ and $z_n$ is equivalent to

$$ P_{n+1}(x)=4P_{n}\bigg(\frac{\delta_n + x}{2}\bigg) \tag{3} $$

Finally, it follows from (3) that

$$ x_n=\frac{\delta_n+x_{n+1}}{2} \tag{4} $$

and hence

$$ x_n=\sum_{j=n}^{\infty} \frac{\delta_j}{2^j} \tag{5} $$

This finishes the proof.