Collatz Conjecture: Understanding the Chain Equation (2.1) in the proof by Simons & de Weger (2003)

collatz conjectureproof-explanation

I am trying to understand the observations that make up the Chain Equation (2.1) from this paper by Simons & de Weger (2003).

I am not clear on the first two observations in the statement of the chain equation.

Let:

  • $n$ be a natural number.

  • $T(n) = \begin{cases}
    \frac{1}{2}(3n + 1), && \text{if }n\text{ is odd}\\
    \frac{1}{2}n, && \text{if }n\text{ is even}\\
    \end{cases}$

  • sequence be an increasing subsequence of odd integers followed by a decreasing subsequence of even integers

  • a cycle be an $m$cycle if it consists of $m$ sequences with a total of $K$ odd numbers and a total of $L$ even numbers.

  • a non-trivial cycle be any cycle that contains natural numbers greater than $2$.

  • a sequence is periodic if there exists an integer $p \ge 1$ in the sequence $\{ n, T(n), T^2(n), \dots, T^{p}(n) \}$ where:

    • $T^0(n) = n$
    • $T^{i+1}(n) = T(T^i(n))$
    • $T^p(n) = n$
  • $t_0, t_1, \dots, t_{m-1}$ be the indices of the $m$ local minima in an $m$-cycle such that:

    • $t_0 = 0$
    • $t_0 < t_1 < t_2 < \dots < t_{m-1} < p$
  • $s_0, s_1, \dots, s_{m-1}$ be the indices of the $m$ local maxima in an $m$-cycle such that:

    • $t_0 < s_0 < t_1 < s_1 < \dots < t_{m-1} < s_{m-1} \le p-1$
  • $x_i, y_i$ be the values of the local minima and maxima so that:

    • $x_i = T^{t_i}(n)$
    • $y_i = T^{s_i}(n)$
  • $k_i, l_i$ be defined so that:

    • $k_i = s_i – t_i$ for $i = 0, \dots, m-1$
    • $l_i = t_{i+1} – s_i$ for $i = 0, \dots, m-2$ and $l_{m-1} = p + t_0 – s_{m-1}$
    • $K = \sum\limits_{i=0}^{m-1}k_i$
    • $L = \sum\limits_{i=0}^{m-1}l_i$

I am unclear on Observation 1 and Observation 2 relating to the chain equation. I am clear on Observation 3 and Observation 4.

Observation 1: $x_i = 2^{k_i}a_i – 1$ for some $a_i \ge 1$

  • Since $x_i$ is odd, there exists $u$ such that $x_i = 2u + 1 = 2(u+1)-1$

  • $k_i = s_i – t_i$ where $s_i$ is the index of the local maxima and $t_i$ is the index of the local minima.

It is not clear to me how we can be sure that $k_i$ is the power of $2$ that applies.

Observation 2: $y_i = 3^{k_i}a_i – 1$

  • If I understand correctly, then $y_i$, the value of the maxima is also odd.

  • To show my confusion, let's assume that $y_i = \frac{1}{2}(3x_i + 1)$ which applying Observation 1 gives:

$$y_i = \frac{1}{2}(3(2^{k_i}a_i – 1) + 1) = 3\cdot2^{k_i-1}a_i – 1$$

  • Which suggests that the $y_i = 3^{k_i}2^u a_i – 1$ but not $y_i = 3^{k_i}a_i – 1$. Does this imply that $a_i$ in Observation 2 is different than the $a_i$ from Observation 1?

I am not clear how $a_i$ is the same value in both observations.

Observation 3: $y_i = 2^{l_i}x_{i+1}$

I am clear on this observation.

Observation 4: The Chain Equation: $3^{k_i}a_i – 1 = 2^{k_{i+1}+l_i}a_{i+1} – 2^{l_i}$

I am clear on the Chain Equation. Here's my reasoning.

  • Here's what I get:

$$3^{k_i}a_i – 1 = 2^{l_i}x_{i+1}$$

So that:

$$3^{k_i}a_i – 1 = 2^{l_i}2^{k_{i+1}}a_{i+1} – 2^{l_i} = 2^{k_{i+1}+l_i}a_{i+1} – 2^{l_i}$$

Best Answer

For some integers $z_i$ and $a_i$, we have

$$x_i \equiv z_i \pmod{2^{k_i}} \implies x_i = 2^{k_i}a_i + z_i \tag{1}\label{eq1A}$$

Next, there are $k_i$ odd integer results in a row after repeated applications of the $T$ function starting with $x_i$. This gives for the first one,

$$\begin{equation}\begin{aligned} T^{1}(x_i) & = \frac{3x_i + 1}{2} \\ & = \frac{3(2^{k_i}a_i + z_i) + 1}{2} \\ & = \frac{3(2^{k_i}a_i) + 3(z_i) + 1}{2} \end{aligned}\end{equation}\tag{2}\label{eq2A}$$

The next one becomes

$$\begin{equation}\begin{aligned} T^{2}(x_i) & = \frac{3T^{1}(x_i) + 1}{2} \\ & = \frac{3\left(\frac{3(2^{k_i}a_i) + 3(z_i) + 1}{2}\right) + 1}{2} \\ & = \frac{\frac{3^2(2^{k_i}a_i) + 3^2(z_i) + 3}{2} + \frac{2}{2}}{2} \\ & = \frac{3^2(2^{k_i}a_i) + 3^2(z_i) + 3 + 2}{2^2} \end{aligned}\end{equation}\tag{3}\label{eq3A}$$

The third one is

$$\begin{equation}\begin{aligned} T^{3}(x_i) & = \frac{3T^{2}(x_i) + 1}{2} \\ & = \frac{3\left(\frac{3^2(2^{k_i}a_i) + 3^2(z_i) + 3 + 2}{2^2}\right) + 1}{2} \\ & = \frac{\frac{3^3(2^{k_i}a_i) + 3^3(z_i) + 3^2 + 3(2)}{2^2} + \frac{2^2}{2^2}}{2} \\ & = \frac{3^3(2^{k_i}a_i) + 3^3(z_i) + 3^2 + 3(2) + 2^2}{2^3} \end{aligned}\end{equation}\tag{4}\label{eq4A}$$

By continuing this, the general result for $T^{q}(x_i)$ for any $1 \le q \le k_i$, which you can fairly easily prove by induction & which I'll leave to you to do, becomes

$$\begin{equation}\begin{aligned} T^{q}(x_i) & = \frac{3^{q}(2^{k_i}a_i) + 3^{q}(z_i) + \sum_{j=0}^{q-1}3^{q-1-j}2^{j}}{2^{q}} \\ & = \frac{3^{q}(2^{k_i}a_i) + 3^{q}(z_i) + 3^{q-1}\sum_{j=0}^{q-1}3^{-j}2^{j}}{2^{q}} \\ & = \frac{3^{q}(2^{k_i}a_i) + 3^{q}(z_i) + 3^{q-1}\sum_{j=0}^{q-1}\left(\frac{2}{3}\right)^{j}}{2^{q}} \\ & = \frac{3^{q}(2^{k_i}a_i) + 3^{q}(z_i) + 3^{q-1}\left(\frac{1-\left(\frac{2}{3}\right)^{q}}{1-\frac{2}{3}}\right)}{2^{q}} \\ & = \frac{3^{q}(2^{k_i}a_i) + 3^{q}(z_i) + 3^{q}\left(\frac{3^{q} - 2^{q}}{3^{q}}\right)}{2^{q}} \\ & = \frac{3^{q}(2^{k_i}a_i) + 3^{q}(z_i + 1) - 2^{q}}{2^{q}} \\ & = 3^{k_i}\left(2^{k_i-q}\right)a_i + \frac{3^{k_i}(z_i + 1)}{2^{q}} - 1 \end{aligned}\end{equation}\tag{5}\label{eq5A}$$

With $q = k_i$, \eqref{eq5A} becomes

$$T^{k_i}(x_i) = 3^{k_i}a_i + \frac{3^{k_i}(z_i + 1)}{2^{k_i}} - 1 \tag{6}\label{eq6A}$$

For $T^{k_i}(x_i)$ to be an integer requires the middle term's numerator to be a multiple of $2^{k_i}$. Since $\gcd(3^{k_i}, 2^{k_i}) = 1$, this gives for some integer $r$ that

$$2^{k_i} \mid 3^{k_i}(z_i + 1) \implies 2^{k_i} \mid z_i + 1 \implies z_i = r\left(2^{k_i}\right) - 1 \tag{7}\label{eq7A}$$

Thus, $r = 0$ gives $z_i = -1$ to be a solution. Also, the middle term in \eqref{eq5A} becomes $0$ so the equation simplifies to $T^{q}(x_i) = 3^{k_i}\left(2^{k_i-q}\right)a_i - 1$. As such, for each $q \lt k_i$, it's an odd integer, matching the requirement that these values all be odd. In addition, \eqref{eq1A} then becomes your Observation $1$, i.e.,

$$x_i = 2^{k_i}a_i - 1 \tag{8}\label{eq8A}$$

Note with $z_i = -1$ that \eqref{eq6A} simplifies to

$$T^{k_i}(x_i) = 3^{k_i}a_i - 1 \tag{9}\label{eq9A}$$

With the definitions being used, after $k_i$ iterations of applying $T$ starting with $x_i$, the set of odd numbers end and an even number is the result at this point (note this means $a_i$ must be odd). The value increases when $T$ is applied to each odd number, but it decreases with each even number, so $T^{k_i}(x_i)$ is a local maximum, i.e., it's your $y_i$. Thus, \eqref{eq9A} gives your Observation $2$, i.e.,

$$y_i = 3^{k_i}a_i - 1 \tag{10}\label{eq10A}$$