A list of predecessors as mentioned in my comment.
I document pairs of $(m,n)$ for consecutive $m$ and their 1-step predecessors $n$ such that $f(n)=m$. The value $n=0$ indicates, that $m$ has no predecessor. I didn't reflect, that one $m$ can have two predecessors, but if $n/2$ is odd, then $n/2$ is a second predecessor.(This makes the table more interesting, because all odd predecessors $n$ are overwritten by the even predecessors $2n$...
Moreover, a nearly periodic structure occurs. I tried to resemble this by the arrangement of three or four columns of $(m,n)$ such that the first column contains all $m$ which have no predecessor. The basic pattern is not really periodic, but has super-patterns which again seem to be periodic but actually aren't. This pattern-superpattern-structure is also recursive. It reminds me of a similar structure when I looked at $\beta=\log_2(3)$ and found a similar style of pattern-superpattern-supersuperpattern-... and is there related to the continued fraction of $\beta$.
So I think we'll get no nice description for the cases $m$ which have no predecessor...
m n m n m n m n
-------------------------------------------------------------
1 2 2 4
3 0 4 6 5 8
6 0 7 10 8 12 9 14
10 0 11 16 12 18
13 0 14 20 15 22 16 24
17 0 18 26 19 28
20 0 21 30 22 32
23 0 24 34 25 36 26 38
27 0 28 40 29 42
30 0 31 44 32 46 33 48
34 0 35 50 36 52
37 0 38 54 39 56
40 0 41 58 42 60 43 62
44 0 45 64 46 66
47 0 48 68 49 70 50 72
51 0 52 74 53 76
54 0 55 78 56 80 57 82
58 0 59 84 60 86
61 0 62 88 63 90
64 0 65 92 66 94 67 96
Update Some more explanation on the idea of "recursive aperiodic pattern".
If we list the values $m$ which have no predecessor, we get
m_k: 3, 6,10,13, 17,20,23,27,30,...
Writing the differences (I have prepended a zero-value to the above list of $m_k$)
,3,3,4 ,3,4 ,3,3,4 ,3,4 ,3,3,4 ,3,4 ,3,4 ,3,3,4 , ...
We note, that we have a pattern of two different words: 3,3,4
and 3,4
repeating, but aperiodical. Let's denote the longer one with the capital A
and the shorter one with the small a
(and A
means a difference of 10 and a
of 7).
We get
Aa Aa Aaa
Aa Aaa
Aa Aa Aaa
Aa Aaa
Aa Aa Aaa
Aa Aaa
Aa ...
Again we find only two kind of "words". Let's them shorten by Aaa
=B
and Aa
=b
. B
means now a difference of 24, b
of 17.
Then we get
bbB bB
bbB bB
bbB bB bB
bbB bB
bbB bB bB
bbB bB
bbB bB
bbB bB bB
...
Next obvious step gives
Cc Cc Ccc
Cc Ccc
Cc Cc Ccc
Cc Ccc
Cc Cc Ccc
Cc Ccc
...
with c
representing a difference of 17+24=41 and C
of 17+17+24=58.
And so on.
If I recall correctly, then with the mentioned case of working with $\beta = \log_2(3)$ the same style of recursive pattern reflected the convergents of the continued fractions of $\beta$.
The first few differences here match the convergents of the continued fraction of $\sqrt2$ so far:
a b c short patterns
-------------------------------------
[1 1 3 7 17 41 99 239 577 ... ] convergents of contfrac(sqrt(2))
[0 1 2 5 12 29 70 169 408 ... ]
-------------------------------------...
A/2 B/2 C/2 long patterns
Update 2 The above can be explained by the following:
- a number of the form $\lfloor2k\sqrt2\rfloor$ has exactly one predecessor $4k$;
- a number of the form $\lfloor(2k-1)\sqrt2\rfloor$ has exactly two predecessors $2k-1$ and $4k-2$;
- a number has no predecessors iff it has form $\lfloor n(2+\sqrt2)\rfloor$.
The first two statements are easily checked, while the third follows from the Beatty theorem, as explained in another answer by @Dattier
Update 3 Using a back-step algorithm (recursive) it seems I've got the predecessing tree of $m=73$. If no bugs, then this tree would also be complete. (But my routine may still be buggy, please check the results!)
The back-steps go from top-right south-west (antidiagonal) downwards. When there are two possible predecessors, they occur in the same column, but on separate rows.
If there is a predecessor without further predecessor, a short line (---
) is printed.
73 <--- start
104
148
105 ---
210
149
212
300 ---
298
211 ---
422
299
424
600 ---
598
423 ---
846 ---
---------------------------- tree seems to be complete (please check for errors!)
Best Answer
I think the phenomena you observe are explained more by the way you construct $B_0$ and $B_1$ than by the properties of primes (though properties of primes seem to play a role). If we suppose instead that $b_i$ were random, obtained by a coin flip, it's easy to see that for each bit of $B_0$ or $B_1$, the probability that the next bit is the same is $\frac{1}{3}$, and the probability that the next bit is different is $2/3$.
Indeed, say $b_i \equiv i \mod 2$ for some $i$, so at time $i$ we have just added $b_i$ to $B_0$.
Then if the next $k-1$ bits are not congruent to their position mod $2$, followed by the $k$th bit congruent to its position, $i+k$, mod $2$, we next append the bit $i+k \mod 2$ to $B_0$. This occurs with probability $\frac{1}{2^k}$, and in each sequence this occurs for exactly one value of $k$.
So we get a different bit with probability $\frac{1}{2} + \frac{1}{2^3} + \frac{1}{2^5} + \dots = \frac{2}{3}$, and the same bit with probability $\frac{1}{2^2} + \frac{1}{2^4} + \frac{1}{2^6} + \dots = \frac{1}{3}$. The $B_1$ case is identical.
So the probability of getting $10$ consecutive zeroes starting at any given point is $\frac{1}{2} \cdot \frac{1}{3^9}$. However, the length of time we expect to take before seeing the first run of $10$ consecutive zeroes is a little longer than the inverse of that, $3^{10}-1=59048$, by a classical argument.
So we waited about 17 times longer than one would expect to find a run of ten zeroes. (But you had 4 choices to pick to find such a long run without zeroes). That's not completely surprising, but it's a lot less strange than in the naive random model, where the expected wait time is $2^{11}-1$, so you would have waited about $489$ times longer than expected.
More generally, if we replace $b_i$ with a Markov chain with probability of taking the same value on adjacent steps $p$, then the probability that $B_0$ takes the same value on adjacent steps should be $$p^2 + p^2(1-p)^2 + p^2 (1-p)^4 + \dots = \frac{ p^2}{ 1- (1-p)^2} = \frac{p^2 }{ 2p -p^2} = \frac{p}{ 2-p}$$ since to get two of the same bit in $B_0$, $b_i$ must be same-same or same-different-different-same or same-different-different-different-different-same.
To explain $\frac{ p}{2-p} = \frac{1}{2} - .28 = .22$ as in dvitek's analysis, we need $p= \frac{2}{ 1/.22 +1} =.361$, i.e. there is a discrepancy of about $.14$ to explain in the more natural statistic of how often two adjacent primes have distinct second-to-last bits.