Solved – Stochastic inequality with product of binomial distributions


I am interested in modeling the following experiment:

A binomial trial with $n$ Bernoulli experiments is run.
For the $k$ positive outcomes, a second (independent) binomial trial with $k$ runs and a different success probability is run.

For example: I throw $n$ darts with probability $p_1$ of hitting the bull's eye. The number of hits is $k$. My friend is then allowed to throw $k$ times and has a different skill set than I do (probability $p_2$). I want to model the probability that he will hit the bull's eye $l$ times.
Let $Y_{n}$ denote this random variable when $n$ trials are run. The first binomial trial is $X^{1}_{n}$ and the second is $X^{2}_{k}$.

Question 1:

What is the distribution of this random variable?

I argue that an ugly form is:

$P \left(Y_{n}=l\right) = \sum_{i=l}^{n}P\left(X_{n}^{1}=i\right)\cdot P\left(X^{2}_i = l\right)$.

Is this correct, and is there a nice way to determine confidence intervals for the success probability $p$ of $Y$? I am especially concerned with the quality of the confidence intervals for small $n$, so I am not comfortable using a normal approximation, which I guess would lead to a sum of products of normal probabilities. Using the Jeffreys prior (which, as far as I know, gives better results for small $n$), I would, at a wild guess, get a sum of products of beta probabilities, which seems numerically difficult (Product of beta distributions).
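
As a quick sanity check of the sum above, here is a minimal sketch (assuming Python with numpy/scipy; the parameter values are made up) that compares the formula against a Monte Carlo simulation of the two-stage experiment:

```python
import numpy as np
from scipy.stats import binom

def pmf_two_stage(l, n, p1, p2):
    """P(Y_n = l) via the sum over the intermediate count i."""
    i = np.arange(l, n + 1)
    return np.sum(binom.pmf(i, n, p1) * binom.pmf(l, i, p2))

# Monte Carlo simulation of the two-stage experiment
rng = np.random.default_rng(0)
n, p1, p2, l = 10, 0.4, 0.6, 3
k = rng.binomial(n, p1, size=100_000)  # first-stage hits
y = rng.binomial(k, p2)                # second-stage hits
print(pmf_two_stage(l, n, p1, p2), np.mean(y == l))  # the two should agree
```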

Question 2:

Finally, for $Y^{i}\sim D \left(n_{i},p_{i}\right)$, with $D$ being the (to me) unknown distribution with overall success probability $p_{i}$, I am interested in calculating (approximately)

$P\left(p^{1}_{n_1}\gt p^{2}_{n_2}\right)$ given sample data. Is there a general way to attack this problem?

Interpretation: there are two teams of dart-player pairs. I observe a series of outcomes and want to calculate the probability that one team has a higher success probability (success being hitting the bull's eye in the second step, which requires a hit in the first step).
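
One possible attack (a sketch, not a definitive method): if, as the answer below shows, each team's outcome is Binomial in the overall success probability $q_i = p_{i,1} p_{i,2}$, then under a Jeffreys $\mathrm{Beta}(1/2,1/2)$ prior the posterior for $q_i$ is $\mathrm{Beta}(s_i + 1/2,\, n_i - s_i + 1/2)$, and $P(q_1 > q_2)$ can be estimated by Monte Carlo. The counts below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: team i scored s_i second-stage bull's eyes
# out of n_i first-stage throws.
n1, s1 = 30, 12
n2, s2 = 25, 7

# Jeffreys Beta(1/2, 1/2) prior -> Beta(s + 1/2, n - s + 1/2) posterior
q1 = rng.beta(s1 + 0.5, n1 - s1 + 0.5, size=200_000)
q2 = rng.beta(s2 + 0.5, n2 - s2 + 0.5, size=200_000)

print("P(q1 > q2 | data) ~", np.mean(q1 > q2))
```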

Question 3:

Is there a nice generalization if I have not only 2, but $k$ successive binomial experiments of this kind? Nice in the sense of not simply extending my formula above and summing/integrating approximately.

Best Answer

Imagine a Binomial trial of size one with success probability $p_1$. Conditional on success, another trial of size one with success probability $p_2$ is run independently. Because independent probabilities multiply, the chance of two successes is $p_1p_2$. Thus this two-step procedure is the same as a one-step procedure in which "success" is declared with probability $p_1p_2$. Your experimental outcome $m$ (I use this instead of "$l$" which is easily misread) is the number of such pairs of successes out of $n$ independent replications of this two-step procedure, which makes it a Binomial experiment of size $n$ with success probability $p_1p_2$.

Mathematically this implies the theorem

$$\sum_{k=m}^n \binom{n}{k} p_1^{k} (1-p_1)^{n-k} \binom{k}{m} p_2^{m} (1-p_2)^{k-m} = \binom{n}{m} (p_1 p_2)^m (1-p_1 p_2)^{n-m}$$

which can be checked formally.
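
For instance, a quick numerical check of the identity (a sketch assuming Python with scipy; the values of $n$, $m$, $p_1$, $p_2$ are arbitrary):

```python
import numpy as np
from scipy.special import comb

def lhs(n, m, p1, p2):
    k = np.arange(m, n + 1)
    return np.sum(comb(n, k) * p1**k * (1 - p1)**(n - k)
                  * comb(k, m) * p2**m * (1 - p2)**(k - m))

def rhs(n, m, p1, p2):
    return comb(n, m) * (p1 * p2)**m * (1 - p1 * p2)**(n - m)

print(lhs(12, 4, 0.3, 0.7), rhs(12, 4, 0.3, 0.7))  # the two sides agree
```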

It should be clear how this generalizes: a chain of $k$ successive binomial stages with success probabilities $p_1, \dots, p_k$ is again a Binomial experiment of size $n$, with success probability equal to the product $p_1 p_2 \cdots p_k$.
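
A simulation sketch of the $k$-stage version (assuming Python with numpy; the stage probabilities are made up): each stage thins the previous stage's successes, and the result matches a single Binomial experiment with the product probability.

```python
import numpy as np

rng = np.random.default_rng(0)
n, ps = 20, [0.9, 0.7, 0.5]  # three successive stages

# Chain the binomial draws: each stage thins the previous successes.
k = np.full(100_000, n)
for p in ps:
    k = rng.binomial(k, p)

# Compare with a single Binomial(n, p1*p2*p3) experiment.
direct = rng.binomial(n, np.prod(ps), size=100_000)
print(k.mean(), direct.mean())  # both should be close to n * 0.9 * 0.7 * 0.5
```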

In terms of the darts analogy, I am arguing that if you let your friend throw one dart each time you hit the mark, then--although the sequence in which your throws alternate with hers is different--in the end it's the same thing, because you have thrown $n$ times and your friend has thrown as many times as you had successes. At each one of your throws, your friend will hit the mark with probability $p_1p_2$: that's a Binomial experiment.
