Solution verification: 2 simple problems involving Conditional Probability and Hypergeometric Distribution

combinatorics, conditional probability, probability, solution-verification, statistics

Q1) An experiment is carried out to investigate how well 100 nano-particles disperse uniformly, when introduced into a container of 900 other particles of similar size and shape.
After the nano-particles are added at the top of the container, the researchers wait 10
hours before randomly selecting a sample of 20 particles without replacement from the
bottom of the container.

a) What is the probability that the sample will contain 10 nano-particles?

b) What assumption did you need to make to answer part (a)?


Q2) Player A, who enters a golf tournament, is not certain whether player B will enter.
Player A has probability 1/6 of winning the tournament if player B enters and probability
3/4 of winning if player B does not enter the tournament. The probability that player B
enters is 1/3. On Monday morning you read that player A won the tournament. What
is the probability that player B entered the tournament?


My Work:

Q1 (a): $P(\text{Getting 10 nano particles})= \frac{\binom{100}{10} \binom{900}{10}}{\binom{1000}{20}}$

Q1 (b) The assumption is that all $\binom{1000}{20}$ subsets of size 20 are equally likely to be chosen.


For Q2, let $A_w$ be the event that player A wins and $B$ the event that player B enters. We have

$P(A_w|B)= \frac{P(A_w \cap B)}{P(B)}=\frac{1}{6}$

$P(A_w|B^c)= \frac{P(A_w \cap B^c)}{P(B^c)}= \frac{3}{4}$

$P(B)= \frac{1}{3}$

We need to find $ P(B | A_w)$.

$ P(B | A_w)= \frac{P(B\cap A_w)}{P(A_w)}$

$P(B \cap A_w) = \frac{1}{6} \times \frac{1}{3}= \frac{1}{18}$

$P(A_w)= P(A_w\cap B) + P(A_w \cap B^c) = \frac{1}{6} \times \frac{1}{3} + \frac{3}{4} \times \frac{2}{3} = \frac{5}{9}$

Then $P(B | A_w)= \frac{\frac{1}{18}}{\frac{5}{9}}= \frac{1}{10}$ (final answer).
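
As a quick numerical check of the arithmetic above, here is a minimal R sketch. The variable names are made up for illustration; only the three probabilities come from the problem statement.

```r
# Hypothetical variable names; the probabilities are those given in Q2.
p_B        <- 1/3   # P(B): player B enters
p_win_B    <- 1/6   # P(A_w | B): A wins given that B enters
p_win_notB <- 3/4   # P(A_w | B^c): A wins given that B does not enter

# Law of total probability: P(A_w) = P(A_w | B) P(B) + P(A_w | B^c) P(B^c)
p_win <- p_win_B * p_B + p_win_notB * (1 - p_B)

# Bayes' rule: P(B | A_w) = P(A_w | B) P(B) / P(A_w)
p_B_given_win <- p_win_B * p_B / p_win

p_win           # should agree with 5/9
p_B_given_win   # should agree with 1/10
```

Comparing with `all.equal(p_B_given_win, 1/10)` rather than `==` avoids spurious mismatches from floating-point rounding.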

Best Answer

(1) The probability of getting exactly 10 of the particles newly introduced is quite small: 4.659696e-06 $\approx$ 0.000004660.

dhyper(10, 100,900,  20)
[1] 4.659696e-06
choose(100,10)*choose(900,10)/choose(1000,20)
[1] 4.659696e-06

It is unclear why one would expect to get 10 new particles out of 20, if the new particles are randomly dispersed. Recapturing about 2 in 20 (at the bottom) seems most likely. Here is a plot of the relevant hypergeometric distribution, for the number of new particles sampled.

x = 0:20; pdf=dhyper(x, 100,900, 20)
sum(pdf)
[1] 1
plot(x, pdf, type="h", lwd=2)
abline(h=0, col="green2")

[Figure: hypergeometric distribution of the number of new particles in a sample of 20]
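
The statement that about 2 in 20 is the most likely outcome can be checked from the same probabilities. The following is a small sketch; `mean_x` and `mode_x` are names chosen here for illustration.

```r
x   <- 0:20
pmf <- dhyper(x, 100, 900, 20)   # same arguments as in the dhyper call above

mean_x <- sum(x * pmf)           # expected number of new particles in the sample
mode_x <- x[which.max(pmf)]      # value of x with the largest probability

mean_x   # the hypergeometric mean is 20 * 100 / 1000 = 2
mode_x
```

Both the mean and the most probable count sit at 2, which is why an observed count of 10 is so far out in the right tail.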

Getting 5 or more of the new particles out of 20 would lead to rejection (at the 5% level) of the null hypothesis that particles are randomly mixed, because the probability of 5 or more is 1 - 0.9585121, about 0.041, which is below 0.05. Getting 10 or more would be extremely strong evidence against random mixing (P-value 5.121643e-06).

qhyper(.95, 100, 900, 20)
[1] 4
phyper(4, 100, 900, 20)
[1] 0.9585121
1 - phyper(9, 100, 900, 20)
[1] 5.121643e-06
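
The tail probabilities above can also be obtained without subtracting from 1. This is a sketch of two equivalent routes, using the `lower.tail` argument of `phyper`; the variable names are illustrative only.

```r
# P(X >= 10): add up the point probabilities P(X = 10), ..., P(X = 20)
p_tail_sum <- sum(dhyper(10:20, 100, 900, 20))

# P(X > 9): ask phyper for the upper tail directly
p_tail_upper <- phyper(9, 100, 900, 20, lower.tail = FALSE)

# P(X >= 5): the probability of landing in the 5% rejection region
p_5_or_more <- phyper(4, 100, 900, 20, lower.tail = FALSE)

c(p_tail_sum, p_tail_upper, p_5_or_more)
```

Requesting the upper tail directly is generally preferable for very small P-values, since computing `1 - phyper(...)` loses precision when the CDF is very close to 1.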

Note: In R, dhyper, phyper, and qhyper designate the hypergeometric PDF (strictly, the probability mass function, since the distribution is discrete), CDF, and inverse CDF (quantile function), respectively.

Addendum per question in comment: In mathematical terms, your answer to 1(b) is OK. In practical terms, you are assuming that the new particles have become randomly mixed with the others.
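
To see why "all subsets equally likely" leads to the hypergeometric formula, one can enumerate every subset for a small case. The sketch below uses a hypothetical container of 10 particles, 4 of them new, and samples of size 3; these numbers are chosen only to keep the enumeration small and are not part of the original problem.

```r
# Hypothetical small container: N particles, of which particles 1..K are "new".
N <- 10; K <- 4; n <- 3

samples <- combn(N, n)                   # one column per possible subset of size n
new_in_sample <- colSums(samples <= K)   # number of new particles in each subset

# If every subset is equally likely, P(X = k) is the fraction of subsets
# that contain exactly k new particles.
counts  <- table(factor(new_in_sample, levels = 0:n))
p_enum  <- as.vector(counts) / ncol(samples)
p_hyper <- dhyper(0:n, K, N - K, n)

rbind(p_enum, p_hyper)
```

The two rows agree, which is the counting argument behind $\binom{100}{10}\binom{900}{10}/\binom{1000}{20}$ on a scale small enough to list every sample.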

If the P-value is smaller than 5% then you reject the null hypothesis at the 5% level.

When you use printed normal tables you are doing a CDF lookup or an inverse CDF lookup. If you look for z-score 1.96 in the margins of the table and find the corresponding probability 0.9750 in the body of the table, that is a CDF lookup. If you scan the body of the table for the nearest probability to 0.9750 and use that to find z-score 1.96 in the margins, that is an inverse CDF lookup. In R, it would be:

pnorm(1.96)
[1] 0.9750021
qnorm(.975)
[1] 1.959964