Number Theory – Trivial Proof of Bertrand’s Postulate

Write the integers from any $n$ through $0$ descending in a column, where $n \geq 2$, and begin a second column with the value $2n$. For each entry after that, if the two numbers on that line share a factor, copy the the entry unchanged, but if they're coprime, subtract $1$.

We'll refer to the first column as $a$, where each value is the same as its index, and the second column as $b$, where the $a$th row's entry is $b_a$. The $0$-index refers to the bottom row. Equivalently,

$$
b_a = \begin{cases} 2n & \textrm{if } a = n \\ b_{a+1} – 1 & \textrm{if }\gcd(a+1,b_{a+1})=1 \\ b_{a+1} & \textrm{otherwise}\end{cases}$$

Consider the following example where $n=8$. I've also included a column showing $\gcd(a,b_a)$, and colored those $b_a$ that share a factor with $a$ and thus don't change.

$$
\begin{array}{|c|c|c|}
\hline
a & b_a & (a,b_a) \\ \hline
8 & \color{red}{16} & 8 \\ \hline
7 & 16 & 1 \\ \hline
6 & \color{red}{15} & 3 \\ \hline
5 & \color{red}{15} & 5 \\ \hline
4 & 15 & 1 \\ \hline
3 & 14 & 1 \\ \hline
2 & 13 & 1 \\ \hline
1 & 12 & 1 \\ \hline
0 & 11 & 11 \\ \hline
\end{array}
$$

Assertion: $b_0$ will always be prime.

Why? Well, suppose not, that some smaller prime $p<b_0$ divides it. In particular, let $p$ be the smallest prime factor that divides $b_0$. Since $b_0 \neq b_n$, and $p\geq 2$, we have $p<n$, so if a prime does divide $b_0$, it must be in our column of $a$ values.

$p \mid b_0 \implies p \mid b_p$. This is because $p$ can only divide $b_0$ if it has already been established by dividing $b_{kp}$ for some $k\geq 1$. A factor cannot have its first appearance at $b_0$ unless it is prime.

That said, $p \mid b_p \implies b_p = b_{p-1}$. However, that means $b_{p-1} \not\equiv b_0 \pmod {p}$, regardless of which $b_a$ decrement or not; there are one too few to make it back to our asserted divisibility, and we're left with $b_0 \not\equiv 0 \pmod {p}$, i.e. $p \nmid b_0$, a contradiction. (Recall that $b_1 – b_0 = 1$ always, preventing a constant $0 \pmod p$ all the way down.)

$$
\begin{array}{|l|l|} \hline
n & 2n \\ \hline
\dots & \dots \\ \hline
p & b_p \equiv 0 \pmod{p} \\ \hline
p-1 & b_{p-1} \equiv 0 \pmod {p} \\ \hline
p-2 & b_{p-2} \equiv \{0\text{ or } p-1\} \pmod{p} \\ \hline
p-3 & b_{p-3} \equiv \{0\text{ or } p-1 \text{ or }p-2\} \pmod{p} \\ \hline
\dots & \dots \\ \hline
0 & b_0 \not\equiv 0 \pmod{p} \\ \hline
\end{array}
$$

Conclusion: As we've established there can be no smallest prime factor dividing $b_0$, it must be prime. Now that we have prime $b_0$, we can apply the same process arbitrarily with any $n$, and immediately we've shown there exists a prime in any $(n,2n)$ interval.

It's pretty clear I got the logic wrong for an important chunk of the proof, and I'm working on a clever way to solve that, but in the meantime, I have an idea for a less elegant fix.

If you look at the actual mechanism of what's going on, it's basically this. The subtracting one only when coprime essentially maintains a number (the difference $b_a – a$ for any $a$) which it's trying to rule out as a prime. This starts off as $n$, which is automatically bumped up to $n+1$ on the next line since $n \mid 2n$. Thereafter, whenever any factor in $a$ is shared by a factor in $b_a -a$, it's marking $b_a-a$ as composite and moving on. You can see that in this partial chart for $n=113$, where the right hand column is just the difference of the first two:

$$
\begin{array}{|l|l|l|} \hline
113 & 226 = 2 \cdot 113 & 113 \\ \hline
112 = 2^4\cdot 17 & 226 = 2 \cdot 113 & 114=2\cdot 3 \cdot 19 \\ \hline
111 = 3\cdot 37 & 226 = 2 \cdot 113 & 115 = 5 \cdot 23 \\ \hline
110 = 2\cdot 5\cdot 11 & 225 = 3^2 \cdot 5^2 & 115 = 5 \cdot 23 \\ \hline
109 & 225 = 3^2 \cdot 5^2 & 116 = 2^2 \cdot 29 \\ \hline
108 = 2^2 \cdot 3^3 & 224=2^5 \cdot 7 & 116 = 2^2 \cdot 29 \\ \hline
107 & 224=2^5 \cdot 7 & 117 = 3^2 \cdot 13 \\ \hline
106 = 2 \cdot 53 & 223 & 117 = 3^2 \cdot 13 \\ \hline
105 = 3 \cdot 5 \cdot 7 & 222 = 2\cdot 3 \cdot 37 & 117 = 3^2 \cdot 13 \\ \hline
104 = 2^3 \cdot 13 & 222 = 2\cdot 3 \cdot 37 & 118 = 2\cdot 59 \\ \hline
103 & 222 = 2\cdot 3 \cdot 37 & 119 = 7 \cdot 17 \\ \hline
102 = 2 \cdot 3 \cdot 17 & 221=13 \cdot 17 & 119 = 7 \cdot 17 \\ \hline
101 & 221=13 \cdot 17 & 120 = 2^3 \cdot 3 \cdot 5 \\ \hline
100 = 2^2 \cdot 5^2 & 220 = 2^2 \cdot 5 \cdot 11 & 120 = 2^3 \cdot 3 \cdot 5 \\ \hline
\dots & \dots & \dots \\ \hline
88 = 2^3 \cdot 11 & 214 = 2 \cdot 107 & 126 = 2 \cdot 3^2 \cdot 7 \\ \hline
87 = 3 \cdot 29 & 214 = 2 \cdot 107 & 127 \\ \hline
86 = 2 \cdot 43 & 213 = 3 \cdot 71 & 127 \\ \hline
\dots & \dots & \dots \\ \hline
\end{array}
$$

It takes $14$ non-decrements, which is exactly the amount needed to get from $113$ through the big gap there up to the next prime $127$, and thereafter there are no more shared factors and it stays $127$ the whole way down, and it does indeed always work like this.

So the size of the prime gap is one determiner of how long that "trial division" section lasts, and the other is the size of the factors themselves. As I said, any factor present will do, and I can't discern much rhyme or reason to it, so that leaves us with making a worst-case upper bound estimate of the sum of the least prime factors comprising every number in the prime gap. In this example, I think that adds up to $60$ or so, but it's one of if not the worst case around.

To make this rigorous, we can use the current best upper bound established on prime gap size for sufficiently large $x$ of $x^{0.525}$. If we consider some large $x$ as having a gap of that size, we can immediately mark half of those entries as being even, which means in the worst case, it would require two $a$-decrements to move past each of those entries within the gap. So that half of the gap is just

$$x^{0.525} / 2 \times 2 = x^{0.525},$$

and leaves us half left to deal with. Here, we could undoubtedly continue to whittle down our estimate by taking out other small factors, but I'm not sure that really helps anyway. Ignoring removing small factors, our bottom line is that we need

$$x^{0.525} x^k < x,$$

where $x^k$ represents an upper bound for the sum of the least prime factors in that gap, and it looks like we need $k<0.475$. I would expect that $x^k$ to work out to something more like $\log{x}$, but I'm not aware of any bounds that immediately say that.

So no, this isn't a completed proof either, but I thought I would share some of my thinking. I'm still hoping for a nice elegant solution to pop out. That said, if this approach can be made to work, that should instantly prove my approach valid for large $n$… but of course, using something more powerful than Bertrand's postulate to help do it sort of defeats the purpose. More updates later.

One other thing worth a mention. There's an easy workaround for scenarios where this fails. If $b_0=cd$, some composite, restart the process using $c, (c+1)d$, and repeat as necessary. This lets you do fun stuff like hit the prime values in $p(p+1)$.

For example, starting with $\{29, 29\cdot 30\}$ will yield $b_0=851=23\cdot 37$. Restart with $\{23, 23\cdot 37 + 23\}$, and you'll get a valid $b_0=853$. This seems to work fine empirically, but I have to doubt there's any way to justify it rigorously.

Update: Just a quicky. I got to thinking about Arnaud's note about reverse engineering, and I've got an idea to float. I tried doing some mapping of the origin possibilities for various $b_0$, and while the primes are nice and robust, the composites are not. The very best they have to offer in the first 500 or so is probably:

which makes sense, what with $209$ being a larger semiprime and $233$ up top being one half of a problem semiprime that shows up a bit.

I had hoped that that possibility graphs for the primes could be infinite, but if my code is right, it turns out they're merely far larger than the non-primes. Here's a sample:

\begin{array}{|l|l|l|l|} \hline
\mathbf{b_0} & & \textbf{nodes} & \textbf{max length} \\ \hline
101 & 101 & 6206 & 818 \\ \hline
102 & 2\cdot 3\cdot 17 & 1 & 0 \\ \hline
103 & 103 & 9779 & 918 \\ \hline
104 & 2^3\cdot 13 & 1 & 0 \\ \hline
105 & 3\cdot 5\cdot 7 & 4 & 2 \\ \hline
106 & 2\cdot 53 & 1 & 0 \\ \hline
107 & 107 & 11059 & 1074 \\ \hline
108 & 2^2\cdot 3^3 & 1 & 0 \\ \hline
109 & 109 & 6293 & 1094 \\ \hline
110 & 2\cdot 5\cdot 11 & 1 & 0 \\ \hline
111 & 3\cdot 37 & 4 & 2 \\ \hline
112 & 2^4\cdot 7 & 1 & 0 \\ \hline
113 & 113 & 8886 & 1184 \\ \hline
114 & 2\cdot 3\cdot 19 & 1 & 0 \\ \hline
115 & 5\cdot 23 & 8 & 4 \\ \hline
116 & 2^2\cdot 29 & 1 & 0 \\ \hline
117 & 3^2\cdot 13 & 4 & 2 \\ \hline
118 & 2\cdot 59 & 1 & 0 \\ \hline
119 & 7\cdot 17 & 44 & 14 \\ \hline
120 & 2^3\cdot 3\cdot 5 & 1 & 0 \\ \hline
121 & 11^2 & 70 & 22 \\ \hline
122 & 2\cdot 61 & 1 & 0 \\ \hline
123 & 3\cdot 41 & 4 & 2 \\ \hline
124 & 2^2\cdot 31 & 1 & 0 \\ \hline
125 & 5^3 & 20 & 8 \\ \hline
126 & 2\cdot 3^2\cdot 7 & 1 & 0 \\ \hline
127 & 127 & 12230 & 1268 \\ \hline
\end{array}

I also analyzed some parameters from the first $15000$ non-prime graphs. There are a few strong correlations, particularly that between large semiprimes and larger graphs, but the most promising find is what looks like a strong bound on the ratio of total nodes in the graph to $b_0$. It was $<1$ always, and looked to be decreasing, suggesting a strong bound might be possible. (This same ratio was $>1$ for all primes, and scaled very close to linearly.)

Since the maximum length (or height if you like) of the graph is the critical piece that determines whether or not this whole conjecture works, and since that length is a subset of the total graph, a hard limit on the number of nodes would effectively be a proof that the conjecture holds up.

To be clear, "nodes" correspond to starting pairs of numbers which would lead to a given $b_0$. The pair of numbers in question are the ones we previously called $n$ and $2n$, but we're being more flexible now. So, if it turned out there were some compelling reason why any given composite $m$ must have less than $m$ different starting pairs that led to its being $b_0$, that would be sufficient for a proof.

Latest attempt

All right. I'm going to try justifying the original $(n,2n)$ approach again.

First, however, I think it serves to look at $(n,n+2)$ as the seed pair. $n=16$ looks good for illustrative purposes. Here's a chart for it; as someone else pointed out, the $b$ column is unnecessary in this case. We could replace it with $c=b-a$, which is more clear and will share all of $b$'s relevant factors, since we're only interested in where $a$ and $b$ overlap. That said, we'll leave it in for this one.

$$
\begin{array}{|ll|ll|ll|} \hline
\textbf{a} & & \textbf{b} & & \textbf{c} & \\ \hline
16 &= 2^4 & 18 &= 2 \cdot 3^2 & 2 & \\ \hline
15 &= 3\cdot 5 & 18 &= 2 \cdot 3^2 & 3 \\ \hline
14 &= 2 \cdot 7 & 18 &= 2 \cdot 3^2 & 4 &= 2^2 \\ \hline
13 & & 18 &= 2 \cdot 3^2 & 5 & \\ \hline
12 &= 2^2 \cdot 3 & 17 & & 5 & \\ \hline
11 & & 16 &= 2^4 & 5 & \\ \hline
10 &= 2 \cdot 5 & 15 &= 3\cdot 5 & 5 & \\ \hline
9 &= 3^2 & 15 &= 3\cdot 5 & 6 &= 2 \cdot 3 \\ \hline
8 &= 2^3 & 15 &= 3\cdot 5 & 7 \\ \hline
7 & & 14 &= 2 \cdot 7 & 7 & \\ \hline
6 &= 2 \cdot 3 & 14 &= 2 \cdot 7 & 8 &= 2^3 \\ \hline
5 & & 14 &= 2 \cdot 7 & 9 &=3^2 \\ \hline
4 &= 2^2 & 13 & & 9 &=3^2 \\ \hline
3 & & 12 &= 2^2 \cdot 3 & 9 &=3^2 \\ \hline
2 & & 12 &= 2^2 \cdot 3 & 10 &=2\cdot 5 \\ \hline
1 & & 12 &= 2^2 \cdot 3 & 11 &\\ \hline
0 & & 11 & & 11 &\\ \hline
\end{array}
$$

We're using the same system here for determining the successive values in $b$ as we did earlier: subtract $1$ when coprime with $a$, otherwise move it down unchanged.

I'd mostly like to use this table to point out there's nothing magical or inexplicable happening here. It's probably most clear in $c$: we're simply counting up from $2$, and keeping each value until it matches with a factor in $a$, and then we increment by one. Any factor at all will do, so long as it's shared with $a$.

A few things to notice. First, since $a$ is ascending with no pauses and $c$ is descending but waiting for a match before incrementing, it's natural that $c$ will grow more slowly, but given the large number of small factors available as $a$ flies by, it'll still grow a respectable amount.

Second thing, and this is really the important one, is to note the $11$ at the bottom of the column. Our whole system is predicated on the notion that this number will always be a prime, provided you plug in reasonable seed values. And this table shows why.

To state the obvious first off, we had to end on something. We didn't know it had to be prime, perhaps, but it's obvious $c$ was counting up and going to end somewhere. More to the point, note that we're not claiming that it's going to reach any specific prime yet, just that it's slowly growing. So the question is, why should we expect that bottom value to be prime necessarily?

Look at the penultimate prime, the $7$. It won't always be $7$, but there will always be a next-to-last prime, and after we hit it, there's often a spatter of small factor annihilation just like we see below. Whether this happened at $7$ or at $737$, the space and factors needed to bridge the gap to the next prime will always be available.

The upshot is that a prime will always be waiting there, since obviously no big factors are showing up between $1$ and $0$. In particular, only smaller factors come after the penultimate prime. Usually there's plenty of room; this example shows as close as it ever gets to having the prime superseded by small factors.

I realize this isn't proof-level justification that that can never happen. That said, I think I could explicitly point out a sufficiently covering bijective mapping of factors from one column to the other that always takes place, but at the moment I'm satisfied if that was persuasive.

And that's the bulk of it. I think taking $(n,n+2)$ better illustrates the underlying mechanism, but if you look closely, you'll notice that line $7$ with $14$ next to it. That means that from there down, this chart is identical to if we had used $(7,14)$ as our seed pair from the outset.

The same applies to any $(p,2p)$; there are arbitrarily many $(n,n+2)$ charts that can be cut off to get whatever pair you like. Presumably this is true for $(n,2n)$ as well, although we'll avoid that just to play it safe. And of course there is no need of actually finding such charts; if you subscribe to the validity of the example process, that should suffice to show the validity of using any $(p,2p)$ as a seed pair.

A couple of closing notes, then. When we do use $(p,2p)$, it has the additional handy feature of providing not only a prime in that range, but the very next prime larger than $p$. This should make sense after having seen our example.

And finally, do note that this gives us what we were after all along: a proof of primes in every $2n$ interval. We can of course apply this as much as we like using any arguments we want for $p$. Some of my additional data also suggest that after five or ten early exceptions have passed, we should be able to use $(4p,5p)$, and sometime after getting up to $1000$ or so, $(9p,10p)$ and even $(19p,20p)$, giving us far tighter bounds on those intervals.

I think that covers it. So what crucial element did I miss this time? Specifically, is the factor matchup stuff a critical tricky part which defeats the whole purpose if I omit it, or is it as straightforward to actually prove as I hope?

(…actually since writing that, I went and ran a battery of tests against that general principle of factor matching. It is ROBUST. This is the least of what it can do reliably. Still not a proof, but I'm much more convinced one would be easy to come up with now.)

Best Answer

Partial answer.

Conjecture 1: $b_0$ is the smallest prime larger than $n$.

Conjecture 2: $b_0$ is always a prime number as soon as $b_n$ is greater than $n+1$ and lower than some increasing bound. For a fixed $n$, all those prime values of $b_0$ make up a set of consecutive primes.

What is proved so far:

Regarding Conjecture 1

If the bottom-right value is a prime, then it's the smallest prime larger than $n$.

The conjecture is true when the gap between $n$ and the next prime is $|p-n|\leq 4$

Regarding Conjecture 2

The table below shows the range of $b_n$ values for which $b_0$ is a prime.

Proof of Conjecture 1 in the case where $n=p-1$ with $p$ prime.

The second row is $(p-2, p+(p-2))$, which are coprime numbers, and therefore by an immediate induction since $p$ is prime you can see that every subsequent row is of the form $$(a,p+a)$$ down to the last row $(0,p)$ as promised.$\,\,\square$

Proof in the case where $n=p-2$ with $p$ prime ($p>2$).

The second row is $(p-3, 2(p-2))$ and these two are not coprime: since $p>2$ is prime, $p-3$ is even. Therefore the third row is $(p-4, (p-4)+p)$ and from here we conclude the same way as before. $\,\,\square$

Proof in the case where $n=p-3$ with $p$ prime.

There you start to see some new arguments, where the proof is not constructive.

The second row is $(p-4, (p-4)+(p-2))$. They're coprime since $p$ is odd. You go down to $(p-5, (p-5)+(p-2))$. As long as you keep coprime pairs, you go down as $(p-k, (p-k)+(p-2))$. But the trick is that $p-2$ can't be prime, otherwise you wouldn't be in the case $n=p-3$, $p$ prime but rather $n=q-1$, $q$ prime (first case treated above) with $q=p-2$. So at the very least, when $a$ becomes a factor of $p-2$, you will get $(a,a+(p-2))$ and from there get down to $(a-1,(a-1)+(p-1))$.

From then on you can't stay at a difference $b-a=p-1$ for long, since $p-1$ is even. As soon as $a$ becomes even you will get up to $b-a=p$ and win.$\,\,\square$

Proof (sketch) in the case where $n=p-4$ with $p$ prime ($p>2$).

The proof for $n=p-3$ can be repeated: you're going to get rid of the difference $b-a=p-3$ very fast since $p$ is odd, you're getting rid of $b-a=p-2$ sooner or later since $p-2$ can't be a prime, and then you're getting rid of $b-a=p-1$ in at most two moves since $p$ is odd.$\,\,\square$

One problem in the general case is that you can't reverse-engineer the table, e.g. $(1,8)$ could come from $(2,8)$ or it could come from $(2,9)$.
If you add a column $b-a$, it starts at $n$, and goes non-decreasing. If it ever reaches a prime number, then it will stay at that prime number, since from then down you will have $(a=k, b=p+k)$ down to $(0,p)$ and the output will therefore be the smallest prime greater than $n$.
So all you've got to do is prove that you do reach a prime at some point. You could try to do that assuming Bertrand's postulate, it would already be some achievement.

Latest attempt

Best Answer

Related Solutions

Related Question