A (geometric?) limit distribution for the “delicacy width” of prime numbers

asymptoticslimitsnumber theoryprime numbersprobability distributions

[Revised to ask only the second of my two previous questions. Revised again to include the approximate Geometric distribution, and to add another reference.]

A recent Quanta article refers to this paper by Filaseta & Southwick, which is behind a paywall. (EDIT: Presumably, Southwick's doctoral dissertation has similar content with no paywall.) The paper has the following Abstract (italics are mine):

Abstract: We show that a positive proportion of the primes have the property that if any one of its digits in base $10$, including its infinitely many leading $0$ digits, is replaced by a different digit, then the resulting number is composite.

Primes having the italicized property are called widely digitally delicate in the article, which mentions that no example of such a prime has yet been found, in spite of there being infinitely many of them (indeed, a positive proportion).

I take the Abstract to mean that $$P_\infty:=\lim_{n\to\infty}{\pi^*(n)\over\pi(n)}>0 $$
where $\pi()$ is the usual prime counting function and $\pi^*(n)$ counts the number of primes less than or equal to $n$ and satisfying the italicized property.

Now, suppose we define the delicacy width (which I'll just call the "width") of a prime as follows:

Definition: The width of a prime $p$ whose decimal expansion is $\ldots d_2 d_1 d_0$ (including infinitely many leading $0$s) is the length of the longest suffix in $\ldots d_2 d_1 d_0$ with the property that changing any single digit in the suffix changes the expansion to that of a composite number.

This defines the width as a function $W:PRIMES\to\mathbb{N}\cup\{\infty\},$ with $W(p)=\infty$ when the longest such suffix is the entire expansion of $p$; otherwise, $W(p)$ is finite and equal to the number of digits to the right of digit $r$, where $r$ is the rightmost digit that can be altered to change the prime into different noncomposite number.

Examples: All primes less than $97$ have width $0$, because in each case the very rightmost digit (having no digits to its right) can be altered to change the prime into another noncomposite; e.g., $89$ can be changed into the noncomposite $83.$ On the other hand, $97$ has width=$1$ because the $9$ (having one digit to its right) is the rightmost digit that can be altered (to $0,1,3,4,$ or $6$) to change the prime into a different noncomposite — altering the $7$ cannot accomplish this.

Question: Since the quoted Abstract implies that a positive proportion ($P_\infty$) of the primes have infinite width, does it follow that the remaining primes have widths $(0,1,2,\ldots)$ occurring in respective proportions $(P_0, P_1, \ldots)$ that sum to $1-P_\infty?$ Can it be concluded that all of these proportions are positive? In other words, does there exist a limit distribution $\{P_w\}_{w\in\mathbb{N}\cup\{\infty\}}$ with $(P_0+P_1+P_2+\ldots)+P_\infty=1,$ where

$$P_k:=\lim_{n\to\infty}{\pi_k(n)\over\pi(n)}>0 \quad\text {for all $k\in\mathbb{N}\cup\{\infty\}$}$$
and $\pi_k(n)$ counts the number of primes that are less than or equal to $n$ and have width equal to $k$?

Here are plots I've computed of the distribution of widths among the first $10^n$ primes, for $n=6,7,8,9(\text{blue})$ — there seems to plenty of "converging" yet to occur, if convergence is indeed what happens:

plot of width distribution in initial segments of the primes

The greatest widths occurring in those respective cases are $14, 16, 24,$ and $31,$ although the plots show only up to width=$8$. (Elsewhere, the widest prime I've found is $142243533671$, with width=$59$. As mentioned above, no one has yet found a prime of the kind described in the Abstract, i.e. a prime of infinite width.)

To get some inkling of what happens for much larger primes and yet keep the computing time feasible, I also looked at the width distribution among the first $10^6$ primes after each number of form $10^n,$ for $n=8,10,12,…,30(\text{blue})$ — the greatest width to occur overall was $53$, but I plot only up to width=$15$:

plot of width distributions in disjoint segments among larger primes


Approximate Geometric Distribution

Examining these plots numerically, it appears that among sufficiently large primes, the distribution of positive integer widths (i.e. $0<W<\infty$) is approximately geometric. That is, the width distributions appear to be approximately as follows (taking $P_0, P_1,$ and $P_\infty$ as parameters):

$$P_k =\begin{cases}
P_0 &\text{if }k=0\\
P_1\,a^{k-1} &\text{if }1\le k\lt\infty\\
P_\infty &\text{if }k=\infty
\end{cases}$$

where $a=1-{{P_1\over 1-P_0-P_\infty}}$ in order to make $P_0+(P_1+P_2+\ldots)+P_\infty=1.$

Here's a comparison of the empirical distribution for $n=30$ (blue) from the preceding picture, and a geometric distribution (red) fitted with $P_0=0.09833$ and $P_1=0.2354$ to match the empirical values, and taking $P_\infty=10^{-9}$ just as some small number (known to be positive even though no instance of infinite width has been found among more than $10^9$ primes). At the scale of the plot, the two distributions visually overlap:

plot comparing an empirical and fitted geometric distribution

NB: The width of a prime can be seen as the outcome of a sequence of "trials" on the successive digits $d_0,d_1,d_2,\ldots$, each trial asking "Can this digit be changed so as to change the expansion into that of another noncomposite number?" — if the answer is "yes", then the trial is a "success", otherwise it's a "failure". The width is then just the number of failures before an eventual success (if there is never a success, then the width is infinite). Empirically, it appears that only trials after $d_0$ (i.e. only the trials on $d_1,d_2,\ldots$) can be modelled as "Bernoulli trials", and so give rise to a geometric distribution only for $0\lt W\lt\infty.$

Best Answer

First, let’s dispense with a core misconception: the phrase “a positive proportion of primes” means that

$$\liminf_{x\to\infty} \frac{\pi^*(x)}{\pi(x)} > 0.$$

In particular, such a claim does not establish that the fraction above converges a limit, only that it stays bounded away from 0. Typically one establishes this by identifying a specific arithmetic progression that is well-positioned to have the desired property, and then use sieve methods to show that we only need to discard relatively few elements of that progression to ensure the property.

If the result in the paper was that the limit exists and is positive, it would be worded more like “we show that the (relatively) density of primes with property X exists and is positive”. If they further succeeded in computing the limit, they would instead say “we determine an expression for the density of primes satisfying X”.

I am not an expert on the methods of the paper, but the Quanta article and other references all cite the use of covering congruences, which strongly suggests that only a lower bound is proved. There may be a separate argument that a limit exists, but I’d expect the authors to make the stronger claim if so.

Now we get to the heart of the question:

In fact, we can prove that $P_k = 0$ for all finite $k$, contrary to the hypothesis in the question. A prime of width $k$ must belong to a pair of primes that differ in only the last $k+1$ digits, and thus differ by $ < 10^{k+1}$. But a standard upper bound sieve argument (similar to twin primes) shows that there are at most $O(x/(\log x)^2)$ pairs of primes that differ by any fixed constant. So the density of primes of width $k$ is $0$, although the convergence rate of $1/\log x$ is really rather slow so you couldn’t possibly have observed it in the sample data you collected.

At this point you may be wondering: but doesn’t this prove that $P_\infty = 1$? The answer is, counterintuitively, no. The reason is that densities of sequences are not the same as probabilities: they are only finitely additive, not countably additive. There are plenty of sequences of sets, each of density $0$, whose union has density $1$. The simplest case is that the naturals are the union of countably many singleton sets, one for each natural number.

For a more interesting case, it is well-known that the proportion of positive integers with exactly $r$ prime factors is $0$ for any $r>0$ (it is proportional to $x (\log \log x)^{r-1} / \log x$, with a constant reminiscent of a Poisson distribution). But that certainly doesn’t mean that most integers have infinitely many prime factors!

Related Question