Solved – Distribution of number of Bernoulli trials to a given number of successes

binomial distribution, distributions, probability

Suppose you have a series of n trials, where the probability of success
in each trial is p. The distribution of the number of successful trials
follows a Binomial distribution with parameters (n, p). The mean is given
by np whereas the variance is np(1-p). So far so good: this is pretty
mundane Stats 101 stuff.

But suppose now that I only knew about the successful trials, and had no
knowledge of the total number n of trials, which is the variable I am
interested in estimating. For example, I knew I had 100 successful trials,
where each trial had a 0.1 chance of success. Is there a known probability
distribution that describes the probable outcomes for n, the total number
of trials? Estimating the mean is easy: if m is the number of successes,
then it's just m/p. But what about variance and other measures?

What if each successful trial had a different (but known) probability of
success? Suppose I had the following records:

  • success1 (with p=0.1)
  • success2 (with p=0.1)
  • success3 (with p=0.2)

Again, a good estimate of the total number of trials can be obtained by
simply summing 1/p for each successful trial. In this case that number
is 10+10+5=25. But what about variance and other measures?

Best Answer

Look at the negative binomial distribution. It is usually stated as the distribution of the number of failures before you see x successes, but if you add the number of failures to the number of successes, you get the total n that you are asking about.

The negative binomial does assume that the last observed trial (the one at which you stopped) was a success, which may match what you want. But if you only know the number of successes, and not whether there were further failures after the last success, then you will need to extend the model.
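As a rough sketch of what this gives for the numbers in the question: under the negative binomial, the total number of trials needed for m successes has mean m/p and variance m(1-p)/p^2. For the case of differing probabilities, the sketch assumes (this is not stated in the question) that each success closes its own independent run of trials with a fixed p, so the total is a sum of independent geometric counts. The function name `total_trials_moments` is made up for illustration.

```python
import math

def total_trials_moments(success_probs):
    """Mean and variance of the total number of trials, assuming the i-th
    success ends an independent run of trials with success probability p_i
    (so each run length is geometric, counting the successful trial)."""
    mean = sum(1.0 / p for p in success_probs)
    var = sum((1.0 - p) / p ** 2 for p in success_probs)
    return mean, var

# 100 successes, each with p = 0.1 (the negative binomial case)
mean_same, var_same = total_trials_moments([0.1] * 100)
print(mean_same, var_same, math.sqrt(var_same))

# Three successes with p = 0.1, 0.1, 0.2 (assumed independent geometric runs)
mean_mixed, var_mixed = total_trials_moments([0.1, 0.1, 0.2])
print(mean_mixed, var_mixed, math.sqrt(var_mixed))
```

With equal probabilities this reduces to the negative binomial moments. Both cases rely on the stopping assumption above, namely that the last trial was a success.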

If the negative binomial does not answer the question, you may want to try a Bayesian approach: choose a prior for n, hold the other parameters constant (or give them priors as well), and apply Bayes' theorem.
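A minimal sketch of that Bayesian route, assuming p is known and fixed, the m successes are Binomial(n, p), and the prior on n is flat over a finite grid from m to an arbitrary cut-off `n_max`. The function name `posterior_n`, the cut-off, and the flat prior are illustrative choices, not something the question specifies.

```python
import math

def posterior_n(m, p, n_max, prior=lambda n: 1.0):
    """Posterior over the total number of trials n on the grid m..n_max,
    given m observed successes with known success probability p.
    `prior` must return a positive (unnormalised) weight for each n."""
    ns = range(m, n_max + 1)
    log_w = []
    for n in ns:
        # log of the binomial likelihood C(n, m) * p^m * (1-p)^(n-m)
        log_lik = (math.lgamma(n + 1) - math.lgamma(m + 1)
                   - math.lgamma(n - m + 1)
                   + m * math.log(p) + (n - m) * math.log(1 - p))
        log_w.append(log_lik + math.log(prior(n)))
    top = max(log_w)                      # subtract the max to avoid underflow
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return {n: wi / total for n, wi in zip(ns, w)}

# Flat prior, 100 successes, p = 0.1; n_max = 3000 is an assumed cut-off
post = posterior_n(m=100, p=0.1, n_max=3000)
post_mean = sum(n * q for n, q in post.items())
post_sd = math.sqrt(sum((n - post_mean) ** 2 * q for n, q in post.items()))
print(post_mean, post_sd)
```

Unlike the negative binomial, this does not require the last trial to have been a success. Passing a different `prior` function changes the result, so the choice of prior should reflect what is actually known about n.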