[Math] Probability of collecting all 4 different items while picking 1 random item from the set

combinatoricsprobability

a.) There are $4$ distinct items in the set. What is the probability of picking all $4$ items after picking $n\ge4$.
b.) How many items do you need to pick to collect all four with a probability of at least $.9$.

My answer for part a. (which I think is wrong):
The sample space is the all the possible ways of picking $n$ items from the set of $4$ items.
That number is equal to the number of non negative integer-valued solutions to this problem:
$a_1+a_2+a_3+a_4=n$
that number is $\binom{n+4-1}{4-1}=\binom{n+3}{3}$

All possible ways of picking $n$ items and having at least 1 of each of the $4$ items is the number of positive integer-valued solutions to this problem:
$a_1+a_2+a_3+a_4=n$
That number is $\binom{n-1}{4-1}=\binom{n-1}{3}$

So the probability of picking all $4$ items after picking $n$ items is:
$\frac{\binom{n-1}{3}}{\binom{n+3}{3}}$

Using this answer, I find that the probability of picking all $4$ items after picking $4$ random items is:
~$.02857\ne\frac{4}{4}\frac{3}{4}\frac{2}{4}\frac{1}{4}$

Please explain where I went wrong.

Best Answer

The original problem is incompletely specified. And one problem with a "stars and bars" approach is that not all solutions are equally likely, making the calculation of probabilities very difficult.

We consider the following equivalent situation. An experiment consists of picking one of the digits $1$, $2$, $3$, $4$, with each choice equally likely. (a) If we perform this experiment $n$ times, what is the probability that we have chosen each of $1$, $2$, $3$, $4$ at least once? (b) What is the smallest $n$ such that if we perform the experiment $n$ times, the probability of having picked each digit at least once is at least $0.9$?

For (a), represent the result of the $n$ experiments as a word of length $n$ over the alphabet $\{1,2,3,4\}$. There are $4^n$ such words, all equally likely. Now we count the words that have all four digitss. As is often the case, it is a little easier to count the complement, the words in which not all of $1$, $2$, $3$, and $4$ appear.

To do this, we will use Inclusion-Exclusion. The number of words that do not have a $1$ is $3^n$. So is the number of words that do not have a $2$, and so on. Add up, for a total of $4\cdot 3^n$. This doesn't quite work, since we have double counted, for example, the words that have neither a $1$ nor a $2$. There are $2^n$ of these. But there are $\binom{4}{2}=6$ double counted patterns. This gives us the estimate $4\cdot 3^n -6\cdot 2^n$ for the number of words with at least one missing digit. However, we have subtracted once too many times the words with just the digit $1$, or just $2$, and so on. So the total number of words with at least one missing digit is $$4\cdot 3^n -6\cdot 2^n +4\cdot 1^n.$$ It follows that the required probability is $$1 -\frac{4\cdot 3^n -6\cdot 2^n +4\cdot 1^n}{4^n}.$$

(b) We want the smallest integer $n \ge 4$ such that $$\frac{4\cdot 3^n -6\cdot 2^n +4\cdot 1^n}{4^n} \lt 0.1.$$ The dominant term is $(4\cdot 3^n)/4^n$. A little calculator work shows that the smallest $n$ that works is given by $n=13$.