How is this a binomial distribution when the trials are not independent

Tags: binomial distribution, probability, probability theory

I was trying to find solutions for this question:

Twenty percent of all telephones of a certain type are submitted for service
while under warranty. Of these, 60% can be repaired, whereas the other 40% must
be replaced with new units. If a company purchases 10 of these telephones, 
what is the probability that exactly two will end up being 
replaced under warranty?

After searching around, I found a solution that uses the binomial formula to calculate $P(X = 2)$.
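For reference, here is a sketch of what that binomial calculation presumably looks like, assuming each telephone independently ends up replaced with probability $p = 0.2 \cdot 0.4 = 0.08$ (submitted for service, then not repairable) and $X$ counts the replaced telephones among the 10:

$$P(X = 2) = \binom{10}{2} (0.08)^2 (0.92)^8 \approx 0.148$$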

A book on probability by Walpole states that the Bernoulli trials of a Bernoulli process must be independent. But in this scenario, we're choosing 10 telephones from a lot and checking whether each one is defective. In essence, our Bernoulli trial is to purchase a telephone from the lot, and the Bernoulli process is 10 such repeated trials.

But once we perform a single trial, the total number of telephones changes, and so does the number of either defective or non-defective ones. In essence, the trials are not independent.

  1. How can we, then, apply the binomial distribution here?
  2. Let's assume you and I have no knowledge of the binomial formula. What would be a more intuitive way of solving this problem through basic probability notions?

Best Answer

From the comments with some additions:

Your question boils down to the relationship between the binomial and hypergeometric distributions. Essentially, the binomial distribution is the limit of the hypergeometric distribution when the number of successes and the number of failures in the whole population are both much larger than the sample size. In that regime, the probability that a draw is a success or a failure depends only very weakly on the previous outcomes. In this example, each phone is replaced with probability $0.2 \cdot 0.4 = 0.08$, so the approximation would be quite accurate if, say, there are 10000 telephones out there (800 of which end up replaced) and we are examining just 10 of them. For instance, the probability that all 10 of them get replaced is $\frac{800}{10000} \cdot \frac{799}{9999} \cdot \dots \cdot \frac{791}{9991} \approx 1.02 \cdot 10^{-11}$, which is pretty close to $0.08^{10} \approx 1.07 \cdot 10^{-11}$.
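To make the comparison concrete, here is a minimal Python sketch that evaluates both distributions for the hypothetical population used above (10000 phones, 800 of them replaced). The function names `binom_pmf` and `hypergeom_pmf` are just illustrative helpers, not taken from the linked solution, and the population size is an assumption since the problem never states one.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeom_pmf(k, N, K, n):
    """P(X = k) when drawing n items without replacement from a
    population of N items, K of which are 'successes'."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Assumed population: 10000 phones, 8% (800) of which end up replaced.
N, K, n = 10_000, 800, 10
p = K / N

# k = 2 is the event asked about; k = 10 is the extreme event discussed above.
for k in (2, 10):
    exact = hypergeom_pmf(k, N, K, n)
    approx = binom_pmf(k, n, p)
    print(f"k={k}: hypergeometric={exact:.4g}, binomial={approx:.4g}")
```

For the event in the original problem (exactly two replacements) the two models should agree far more closely than for the all-ten case, because two is nowhere near the tail of the distribution.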

So it comes down to intuition about the problem at hand: presumably there are far more than ~100 phones of this type out there.

That said, the caveat I pointed out, namely that both the number of successes and the number of failures need to be much larger than the sample size, can be important if you're considering an event with very low or very high probability. Indeed, in the example above the relative error is about 5%, which happens because 800 is not that much larger than 10.
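As a rough illustration of this caveat, the sketch below computes the relative error of the binomial value $0.08^{10}$ against the exact without-replacement product for a few assumed population sizes. The helper names and the list of population sizes are hypothetical choices made for illustration only.

```python
from math import prod


def all_replaced_exact(N, K, n):
    """Probability that all n sampled phones are among the K replaced ones,
    drawing without replacement from N phones: K/N * (K-1)/(N-1) * ..."""
    return prod((K - i) / (N - i) for i in range(n))


def relative_error(N, p, n):
    """Relative error of the binomial value p**n, measured against p**n,
    when the population has N phones and round(N * p) of them are replaced."""
    K = round(N * p)
    exact = all_replaced_exact(N, K, n)
    approx = p**n
    return (approx - exact) / approx


# Assumed population sizes; the problem statement does not give one.
for N in (1_000, 10_000, 100_000, 1_000_000):
    print(f"N={N}: relative error = {relative_error(N, 0.08, 10):.2%}")
```

The error should shrink as the population grows, since the number of replaced phones $0.08 N$ then becomes much larger than the sample size of 10.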