[Math] Probability Question for Defective Bulbs (Verification)

binomial distribution, probability, probability theory, proof-verification, statistics

A company manufactures bulbs. The probability that a randomly selected bulb, after it is shipped to a customer, is defective equals $0.01$.

A) In a shipment of $100$ bulbs, what's the probability that the total number of defective bulbs received does not exceed $2$?

B) Let $X$ denote the number of defective bulbs in a shipment of $100$ bulbs. Write the probability mass function of $X$.

Bonus Question: What distribution does $X$ follow?

C) Suppose you receive a shipment of $100$ bulbs from this company, and you test the bulbs, one at a time, picked randomly and independently of one another, until you find a functional (non-defective) bulb or you run out of bulbs. Let $Y$ denote the number of bulbs you will end up testing. Write the probability mass function of $Y$.

This is what I have so far (I'm not sure if what I did is right, please help):

[Image of the asker's work; not reproduced in this copy]

Best Answer

Your calculation for part (A) is correct.

Your work for part (B) is essentially correct, but there are two issues. First, your notation is nonstandard and a bit incomplete; it is more proper to write, for example, $$\Pr[X = x] = \binom{100}{x} p^x (1-p)^{100-x}, \quad x \in \{0, 1, 2, \ldots, 100\}.$$ Second, since you are given $p = 0.01$, it is best that you substitute the value in, giving $$\Pr[X = x] = \binom{100}{x} (0.01)^x (0.99)^{100-x}, \quad x \in \{0, 1, 2, \ldots, 100\}.$$ The same applies to the bonus question, where we write $$X \sim \operatorname{Binomial}(n = 100, p = 0.01).$$ If you are not familiar with the binomial coefficient, there are different notations for it: $$\binom{n}{k} = {}_n C_k = C^n_k = C(n,k) = C_{n,k}$$ are all equivalent ways of writing it.
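If you want to check the numbers for parts (A) and (B) yourself, here is a minimal Python sketch of the binomial mass function above. The names `binomial_pmf`, `N`, and `P` are my own choices for illustration, not anything from the problem statement.

```python
from math import comb

N = 100   # bulbs per shipment (from the problem statement)
P = 0.01  # probability that a single bulb is defective


def binomial_pmf(x, n=N, p=P):
    """Return Pr[X = x] for X ~ Binomial(n, p); zero outside 0..n."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Part (A): at most 2 defective bulbs means X is 0, 1, or 2.
prob_at_most_two = sum(binomial_pmf(x) for x in range(3))
print(f"Pr[X <= 2] = {prob_at_most_two:.4f}")

# Sanity check for part (B): the masses over x = 0..100 should sum to 1.
total = sum(binomial_pmf(x) for x in range(N + 1))
print(f"sum of masses = {total:.10f}")
```

The first value comes out to roughly $0.9206$, which you can compare against your own result for part (A).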

Your calculation for part (C) is incorrect. You are not asked for an expected value. You want the probability that, for a fixed $y \in \{1, 2, \ldots, 100\}$, your testing stops at bulb $y$. So your answer will be a function $f_Y(y)$ of $y$, and it is a mass function satisfying $$\sum_{y=1}^{100} f_Y(y) = 1.$$

To this end, let's work out an example: What is the probability that you stop testing immediately after the first bulb is tested? Remember, you stop once you find a good bulb, or you've tested all the bulbs. So, stopping after testing the first bulb means you tested it and it was good. If $Y$ is a random variable that represents the number of bulbs tested in this sequential manner, then clearly $$\Pr[Y = 1] = 1 - p = 0.99.$$

Now, what is the probability that you stop immediately after the second bulb is tested? This amounts to testing the first bulb and finding it is defective (or else you would have stopped!), then testing the second and finding it is good. So this is $$\Pr[Y = 2] = p(1-p) = (0.01)(0.99).$$ Extending this reasoning further, we can see that $$\Pr[Y = y] = p^{y-1}(1-p) = (0.01)^{y-1}(0.99), \quad y \in \{1, 2, \ldots, 99\}.$$ This is because in order to stop testing right after the $y^{\rm th}$ bulb, you'd have to see $y-1$ defective bulbs, and then the final $y^{\rm th}$ bulb must be good. But why is this formula true only for $y$ up to $99$? Why not $100$? Because once $100$ bulbs are tested, there isn't a $101^{\rm st}$ bulb to test: you've exhausted your supply, so you stop after the $100^{\rm th}$ bulb whether it is good or defective. So $Y = 100$ happens exactly when the first $99$ bulbs are all defective, and $$\Pr[Y = 100] = p^{99}(1-p) + p^{100} = p^{99} = (0.01)^{99}.$$ With this value the masses sum to $1$, as they must, since $\sum_{y=1}^{99} p^{y-1}(1-p) = 1 - p^{99}$.
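Here is a small Python sketch of the mass function of $Y$, mainly so you can confirm that it sums to $1$. It assumes the last bulb is tested regardless of outcome, so that $Y = 100$ occurs whenever the first $99$ bulbs are defective and carries mass $p^{99}$. The names `stopping_pmf`, `N`, and `P` are hypothetical, chosen only for this illustration.

```python
N = 100   # bulbs per shipment (from the problem statement)
P = 0.01  # probability that a single bulb is defective


def stopping_pmf(y, n=N, p=P):
    """Return Pr[Y = y], where Y is the number of bulbs tested.

    Assumption: testing stops at the first good bulb, or after bulb n
    if the first n - 1 bulbs were all defective.
    """
    if 1 <= y < n:
        # y - 1 defective bulbs, then a good one.
        return p ** (y - 1) * (1 - p)
    if y == n:
        # The first n - 1 bulbs were defective; bulb n ends the testing
        # whether it is good or not.
        return p ** (n - 1)
    return 0.0


total = sum(stopping_pmf(y) for y in range(1, N + 1))
print(f"Pr[Y = 1] = {stopping_pmf(1)}")
print(f"Pr[Y = 2] = {stopping_pmf(2)}")
print(f"sum of masses = {total:.12f}")
```

With $p = 0.01$ the mass at $y = 100$ is far too small to see numerically. Calling `stopping_pmf(y, n=3, p=0.5)` for $y = 1, 2, 3$ gives $0.5$, $0.25$, $0.25$, which makes the extra mass at the cap easier to spot.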

This distribution for $Y$ is what we might call an "upper-modified geometric distribution," something akin to zero-modified distributions.