Compute the chances recursively.
Let $p_s(x)$ be the probability that exactly $x$ values, $0 \le x \le k$, are selected in all $s\ge 1$ independent draws of $k$ items (without replacement) from a population of $n \ge k \gt 0$ members. (Let's hold $n$ and $k$ fixed for the duration of the analysis so they don't have to be mentioned explicitly.)
Let $p_s(x\mid y)$ be the probability that, given exactly $y$ values are selected in the first $s-1$ draws, exactly $x \le y$ of them are selected in the last draw. Because there are $\binom{y}{x}$ ways to choose $x$ of those $y$ elements, and $\binom{n-y}{k-x}$ ways to choose the remaining $k-x$ elements of the draw from the other $n-y$ members of the population,
$$p_s(x\mid y) = \frac{\binom{y}{x}\binom{n-y}{k-x}}{ \binom{n}{k}}.$$
The law of total probability asserts
$$p_s(x) = \sum_{y=x}^k p_s(x\mid y) p_{s-1}(y).$$
For $s=1$, it's a certainty that $x=k$: this is the starting distribution.
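As a sanity check, note that for fixed $y$ the conditional probability above is just a hypergeometric distribution, so it can be compared against R's built-in `dhyper` and a quick simulation. The parameter values below are illustrative, not those from the question:

```r
# Check p_s(x | y) against dhyper and a simulation (illustrative parameters).
set.seed(1)
n <- 20; k <- 6; y <- 5
exact <- choose(y, 0:k) * choose(n - y, k - 0:k) / choose(n, k)
# Elements 1, ..., y play the role of the y values already selected.
x <- replicate(2e4, sum(sample(n, k) <= y))
round(rbind(Exact = exact, Simulated = tabulate(x + 1, k + 1) / length(x)), 3)
```

The two rows should agree to within simulation error, and the `Exact` row coincides with `dhyper(0:k, y, n - y, k)`.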
The total computation needed to obtain the full distribution up through $s$ repetitions is $O(k^2 s)$. Not only is that reasonably quick, the algorithm is easy. One pitfall awaiting the unwary programmer is that these probabilities can become extremely small and underflow floating-point calculations. The following R
implementation avoids this by computing the values of $\log(p_s(x))$ in columns $1, 2, \ldots, s$ of an array.
lp <- function(s, n, k) {
  # Column u holds log p_u(x) for x = 0, ..., k (stored in row x+1).
  P <- matrix(NA, nrow=k+1, ncol=s, dimnames=list(0:k, 1:s))
  P[, 1] <- c(rep(-Inf, k), 0)            # s = 1: x = k with certainty
  for (u in seq_len(s)[-1])               # safe even when s = 1
    for (i in 0:k) {
      # Log of each term p_u(i | y) p_{u-1}(y) for y = i, ..., k
      q <- P[(i:k) + 1, u-1] + lchoose(i:k, i) + lchoose(n - (i:k), k-i) - lchoose(n, k)
      # Log-sum-exp: factor out the maximum to avoid underflow
      q.0 <- max(q, na.rm=TRUE)
      P[i+1, u] <- q.0 + log(sum(exp(q - q.0)))
    }
  return(P)
}
p <- function(...) zapsmall(exp(lp(...)))
The answer to the question is obtained by letting $s=5,$ $n=10000=10^4$, and $k=100=10^2$. The output is a $101\times 5$ array, but most of the numbers are so small we may focus on very small $x$. Here are the first four rows corresponding to $x=0,1,2,3$:
p(5, 1e4, 1e2)[1:4, ]
The output is
  1         2         3      4        5
0 0 0.3641945 0.9900484 0.9999 0.999999
1 0 0.3715891 0.0099034 0.0001 0.000001
2 0 0.1857756 0.0000481 0.0000 0.000000
3 0 0.0606681 0.0000002 0.0000 0.000000
Values of $x$ label the rows while values of $s$ label the columns.
Column 5 shows that the chance one element appears in all five samples is minuscule (about one in a million), and that there is essentially no chance two or more elements appear in all five samples.
If you would like to see just how small these chances are, look at their logarithms. Base 10 is convenient and we don't need many digits:
u <- lp(5, 1e4, 1e2)[, 5]
signif(-u[-1] / log(10), 3)
The output tells us how many zeros there are after the decimal point:
  1    2    3    4    5    6    7    8    9   10 ...   97    98    99   100
6.0 12.3 18.8 25.5 32.3 39.2 46.2 53.2 60.4 67.6 ... 917.0 933.0 949.0 967.0
Numbers in the top row are values of $x$. For instance, the chance of exactly three values showing up in all five samples is found by computing exp(u[4]), giving $0.000\,000\,000\,000\,000\,000\,1434419\ldots$, and indeed this has $18$ zeros before the first significant digit. As a check, the last value $967.0$ is a rounded version of $967.26$: $\binom{10000}{100}^{-4}$ (the chance that the first sample reappears in each of the next four samples) equals $10^{-967.26}.$
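That closing figure can be reproduced directly with lchoose, using the same $n$ and $k$ as above:

```r
# log10 of binom(10000, 100)^(-4): the chance that the first sample
# reappears in each of the remaining four draws.
n <- 1e4; k <- 1e2
signif(-4 * lchoose(n, k) / log(10), 5)  # about -967.26
```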
For the first part you should probably learn more about the geometric distribution and possibly the negative binomial distribution.
The geometric distribution models the number of independent trials needed to see the first success (or, in some formulations, the number of failures before the first success). The negative binomial extends this: it models how many trials (or failures) occur before the $k$th success, with the geometric as the special case $k=1$. So with a 25% chance of success on each try (agreeing with your computations and assuming the trials are independent), the probability (from the geometric distribution) of matching the first and last in 4 tries or fewer is only about 68%. Is that a "high probability"? To get over 95% you need 11 tries.
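Those numbers are easy to reproduce in R, assuming (as above) independent tries with success probability 0.25. Note that R's pgeom counts failures before the first success:

```r
p <- 0.25
1 - (1 - p)^4                    # about 0.68: success within 4 tries
pgeom(3, p)                      # the same quantity via the geometric CDF
ceiling(log(0.05) / log(1 - p))  # smallest number of tries exceeding 95%: 11
```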
For part 2.1, this is easiest to compute as the complement of matching no bits. For each of the 8 bits there is a 50% chance of not matching it, so the probability of matching none is $0.5^8$, and the probability of matching at least one is therefore $1-0.5^8 \approx 0.996$.
The probability of matching exactly $n$ bits (without caring which $n$ bits) follows the binomial distribution.
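In R these are one-liners, a sketch of the calculations just described with 8 fair, independent bits:

```r
dbinom(0, 8, 0.5)      # probability of matching no bits: 0.5^8
1 - dbinom(0, 8, 0.5)  # at least one match: about 0.996
dbinom(0:8, 8, 0.5)    # exactly n matching bits, for n = 0, ..., 8
```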
For 2.2 you can now combine the binomial and geometric distributions (if I understand it correctly and the assumptions hold).
This all assumes that each try is independent: we learn nothing from a previous try except whether it succeeded or failed.
If we do learn something each time (for example, that the first try matched the first digit, though we know nothing about the rest, and the target will not change for the second try), then this becomes more like the coupon collector's problem.
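A hedged sketch of that combination, under the same independence assumptions: the per-try "success" probability comes from the binomial part, and the number of tries until success from the geometric part.

```r
p.try <- 1 - dbinom(0, 8, 0.5)  # per-try chance of matching at least one bit
# Distribution of the number of tries until the first such success:
# P(first success on try t) = (1 - p.try)^(t - 1) * p.try
dgeom(0:2, p.try)               # t = 1, 2, 3 (dgeom counts the prior failures)
```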
Best Answer
Assumptions:
If these assumptions are true, then your answer and your reasoning are correct.
Here's another way to look at it that might require less arithmetic. Reframe the question: each digit is sampled independently and uniformly from $\{0, 1, \dots, 9\}$. Then the probability that the last 4 digits of two 8-digit numbers match is the same as the probability that two 4-digit numbers match exactly: the other digits don't matter, since they are independent of the last 4 and we are not interested in whether they match. The probability the last 4 digits match is therefore $(1/10)^4$, which is what you wrote above. Taking this same perspective for the 12-digit numbers, it doesn't matter what the first 8 digits are, and you again have a $(1/10)^4$ chance of matching last 4 digits.
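The reframing is easy to confirm by exhaustive enumeration in a smaller case (2 trailing digits instead of 4; purely illustrative):

```r
# All pairs of independent uniform 2-digit strings (00 through 99)
pairs <- expand.grid(a = 0:99, b = 0:99)
mean(pairs$a == pairs$b)  # equals (1/10)^2 = 0.01
```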