Collatz Conjecture – Average Number of Steps

collatz conjecturelogarithms

I was playing with the Collatz conjecture and decided to check it for the first 100,000,000 natural numbers. I divided this range in equal intervals of 100,000 and calculated the average number of steps to reach 1 in each one of these intervals. By graphing these averages, I got the following result
enter image description here

Which seems to show a very clear logarithmic behavior. By doing a least squares regression (shown in blue), I derived that the average number of steps ($S$) for a number $n$ to reach 1 is
$$S\approx10.45\ln(n)-3$$
Is there a way to derive this logarithmic relation?

Best Answer

The function $S(n)$ can have its value estimated by considering the expected value $s(n)$, which corresponds to the average number of steps near $n$. For some "random" $n$, there is a $1/2$ probability that it is even or odd, therefore, $$s(n)=\frac{1}{2}\left(1+s\left(\frac{n}{2}\right)\right)+\frac{1}{2}\left(2+s\left(\frac{3n+1}{2}\right)\right)$$ If $s(n)=b\ln(n)$, then $$b\ln(n)=\frac{1}{2}\left(1+b\ln\left(\frac{n}{2}\right)\right)+\frac{1}{2}\left(2+b\ln\left(\frac{3n+1}{2}\right)\right)$$ Solving for $b$, $$b=\frac{3}{\ln\left(\frac{4}{3+1/n}\right)}$$ which, for $n\rightarrow \infty$, $$b=\frac{3}{\ln\left(4/3\right)}\approx10.43$$ Which matches the value found by the regression.

Note that $s(n)=b\ln(n)+c$ is also a solution to the first equation for any constant $c$, and the value of $b$ does not change as all terms with $c$ are cancelled out. Determining its value is addressed in another post I made.