At the bottom of this post, I've explained what motivated me to ask this question – an entertaining mathematical observation, even if you choose to ignore the rest of this post.
We have a random sequence of digits, $X_1$, $X_2$, … where each $X_i$ is independently randomly selected from {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.
Choose integer $n \ge 1$.
What is the probability that for all integer $j$ satisfying $10^{n-1} \le j < 10^n$ the following condition is met:
$\large \sum_{i=0}^{n-1} 10^{n-i-1}X_{j+i} \ne j$
In other words, the probability that for all $n$-digit integers j, the string of digits starting at position $j$ do NOT make the number $j$. Call this probability $P_n$.
For $n = 1$, the answer is simple: $P_1 = (\frac{9}{10})^9 \approx 39\%$
For $n = 2$ and $n=3$, I've run a few random simulations, and it looks as if $P_n$ is around $40 \%$ in both cases.
Taking a naive approach, if we picked any particular $j$, the probability that the number string $X_j X_{j+1} X_{j+2} … X_{j+n-1}$ does not equal j, would be $1-(\frac{1}{10})^n$, and the probability that this happened in $(10^n – 10^{n-1})$ independent trials would be
$\large [1-(\frac{1}{10})^n]^{10^n – 10^{n-1}}$ an expression which is around $40 \%$ for $n > 1$ (It rapidly converges to $e^{-0.9} \approx 40.7 \%$ as $n \to \infty$)
This isn't quite correct for $P_n$ because the n-digit numbers we are considering overlap, so are not independent, but maybe the dependence is weak enough that the naive approach is close to the correct answer. I started to analyse the problem considering tree diagrams of possible outcomes, but it got messy and my question is whether there is a slicker way to crack the problem: how to calculate $P_n$ (or put some bounds on its value).
I was prompted to consider this question, because I learnt that if you look at the decimal digits of $\pi$ (after the decimal place), the digits starting at the 36,541,622,473th place are 36541622473. Wow! I know $\pi$ is not a random number but its digits have quasi-random properties and I was intrigued to know how likely it is to find such a result (or as I've posed the problem, how likely for no such result to occur). If my estimate is right, the chances are 60% so not that surprising, although it is impressive that someone found an 11-digit example.
Best Answer
If we fix $n$ then the digits of interest are $(X_a, X_{r+1}, ... X_b)$ where $a = 10^{n-1}$ and $b=10^n+n-2$. Let $T = b-a+1$
Define $D \subset \mathbb Z^{T}$ where $D=\{(x_a,...x_b):0 \le x_i \le 9 \}$
So $|D|=10^T$. $(X_a, X_{a+1}, ... X_b)$ is equally likely to be any member of $D$.
For $10^{n-1} \le j < 10^n$ (ie $j$ is an $n$-digit number), and $\mathbf x \in D$ we define $\lambda(j,\mathbf x)$ to be an indicator of a match, ie
$\lambda(j,\mathbf x) = \left\{ \begin{array}{lc} 1 & \quad \sum_{i=0}^{n-1} 10^{n-i-1}X_{j+i} = j \\ 0 & \text{otherwise} \end{array} \right.$
Because the condition $\lambda(j,\mathbf x) = 1$ fixes $n$ elements of $\mathbf x$, leaving the remaining $T-n$ elements to take any value from 0 to 9,
$\large \sum_{\mathbf x \in D} \lambda(j,\mathbf x) = 10^{T-n}$
The total number of matches over all $D$ is
$\large \sum_{\mathbf x \in D} \sum_{j=10^{n-1}}^{10^n-1} \lambda(j,\mathbf x) $
$=\large \sum_{j=10^{n-1}}^{10^n-1} \sum_{\mathbf x \in D} \lambda(j,\mathbf x)$
$=\large \sum_{j=10^{n-1}}^{10^n-1} 10^{T-n}$
$=(10^n-10^{n-1})10^{T-n}$
$=10^n(1-\frac{1}{10})10^{T-n}$
$=\frac{9}{10}10^T$
If we divide this by $|D|$ we obtain the mean number of matches, which is
$\frac{9}{10}$ or $0.9$.
Note how this is independent of $n$: on average, we expect just under 1 match among the 90 2-digit positions, and equally just under 1 match among the 9000 4-digit positions.
This indicates a match is a rare event, and although if one match occurs having a second match is not entirely independent, it is reasonable to approximate the number of matches as a Poisson distribution with mean $0.9$ (with the approximation improving rapidly as $n$ increases). So, approximately:
Probability of no matches is $e^{-0.9} \approx 40.7\%$
Probability of one match is $e^{-0.9}0.9 \approx 36.6\%$
Probability of two matches is $\frac{e^{-0.9}0.9^2}{2} \approx 16.5 \%$
Probability of more than two matches is approximately $6.3 \%$
These are broadly confirmed by running some random simulations for $n=2$ and $n=3$.