I recently talked to a friend about this problem:
Someone is trying to brute-force a secret PIN code, consisting of 4 digits (0-9).
There are actually 10 distinct codes that are acceptable, and guessing any of these would do.
The person trying to guess has a maximum of 5 tries.
What is the probability that they would guess a correct code?
Now, my thinking goes like this: For an attempt there's a $\frac{10}{10^4} = 0.1\%$ chance of success and since there are 5 tries, a total chance of 0.5% would be a good baseline, but the actual odds should be slightly more, because every failure to guess decreases the pool of possible codes.
So, computed the probability to guess as follows:
Attempt # Correct codes left Total codes p(right)
1 10 10000 0.1 %
2 10 9999 0.1000100010001%
3 10 9998 0.1000200040008%
4 10 9997 0.1000300090027%
5 10 9996 0.1000400160064%
SUM 0.5001000300100%
The probability to guess in max 5 attempts is the sum of the probabilities of guessing in any of the 5 attempts.
So I got the result of $0.5001000300100\%$, which matches my intuition.
However, my friend suggested another approach: let's compute the probability of being wrong on all the 5 attempts ($p(all\ 5\ wrong)$). The probability to guess must be $p(guess\ in\ 5) = 1 – p(all\ 5\ wrong)$.
So, we computed this as follows:
Attempt Wrong codes left Total codes p(wrong)
1 9990 10000 99.9 %
2 9989 9999 99.8999899989999%
3 9988 9998 99.8999799959992%
4 9987 9997 99.8999699909973%
5 9986 9996 99.8999599839936%
Now, we figured that
$$p(all\ 5\ wrong) = \prod\limits_{i\le 5} p(wrong_i) = 99.5008993700451\%$$
That means that
$$p(guess\ in\ 5) = 1 – p(all\ 5\ wrong) = 1 – 99.5008993700451\% = 0.4991006299549\%$$
which is a surprising result, because it's less than the baseline $0.5\%$.
I feel like the second approach is wrong, but cannot figure out why.
Best Answer
The second approach is right and the first one has a mistake.
Here's a suggestion for how you can see the mistake: what if everything were the same about this problem, except that there were just 16 total codes (10 of which were valid) and you had 5 guesses to find them? What would happen if you repeated your first approach in the exact same way?
EDIT: Let me try to go a bit further than just convincing you that the first one is wrong; here's a bit on why it's wrong.
The short answer for what's broken is that you're overcounting probabilities. When you add probabilities, you're implicitly assuming that the underlying events are disjoint -- that is, they can't occur together. However, that's not the case here. It's quite possible to guess correctly on attempt 1 and to guess correctly on attempt 2.
If you want to proceed in this fashion, you need to restyle your events:
Event 1: You guess right on your first try (probability: $10/10000$)
Event 2: You don't guess right on your first try, but you do on your second (probability: $\frac{9990}{10000} \cdot \frac{10}{9999}$)
Event 3: You don't guess right on your first two tries, but you do on your third (probability: $\frac{9990}{10000} \cdot \frac{9989}{9999} \cdot \frac{10}{9998}$)
etc. By doing this, you'll have constructed disjoint events, which means it's appropriate to add their probabilities. You will then recover the other answer.