Binomial and Beta Distributions – Understanding Their Relationship

beta distribution, beta-binomial distribution, binomial distribution

I'm more of a programmer than a statistician, so I hope this question isn't too naive.

The context is sampling a program's execution at random times. If I take N=10 random-time samples of the program's state, I might see function Foo being executed in, for example, I=3 of those samples. I'm interested in what that tells me about the actual fraction of time F that Foo is in execution.

I understand that I is binomially distributed with mean F*N. I also know that, given I and N, F follows a beta distribution. In fact, I've verified programmatically the relationship between those two distributions, which is

cdfBeta(I, N-I+1, F) + cdfBinomial(N, F, I-1) = 1
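For concreteness, here is a minimal numerical check of that identity, assuming SciPy's `scipy.stats.beta` and `scipy.stats.binom`; the values of N, I, and F are arbitrary examples:

```python
# Quick numerical check of the identity above, assuming SciPy is available.
from scipy.stats import beta, binom

N, I, F = 10, 3, 0.25   # arbitrary example values

# cdfBeta(I, N-I+1, F) + cdfBinomial(N, F, I-1)
total = beta.cdf(F, I, N - I + 1) + binom.cdf(I - 1, N, F)
print(total)            # prints 1.0 (up to floating-point rounding)
```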

The problem is I don't have an intuitive feel for the relationship. I can't "picture" why it works.

EDIT: All the answers were challenging, especially @whuber's, which I still need to grok, but bringing in order statistics was very helpful. Nevertheless, I've realized I should have asked a more basic question: given I and N, what is the distribution of F? Everyone has pointed out that it's a Beta distribution, which I knew. I finally gathered from Wikipedia (Conjugate prior) that it appears to be Beta(I+1, N-I+1), and after exploring it with a program, that does seem to be the right answer, so I would like to know if I'm wrong. I'm also still confused about the relationship between the two cdfs shown above: why they sum to 1, and whether they even have anything to do with what I really wanted to know.
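To make that exploration reproducible, here is a minimal Monte Carlo sketch (mine, assuming NumPy and SciPy) of the claim: with a uniform prior on F, the conditional distribution of F given I successes out of N should be Beta(I+1, N-I+1).

```python
# Monte Carlo check: with F ~ Uniform(0,1) and I | F ~ Binomial(N, F),
# the conditional distribution of F given I = I_obs should be Beta(I_obs+1, N-I_obs+1).
import numpy as np
from scipy.stats import beta, kstest

rng = np.random.default_rng(0)
N, I_obs = 10, 3
trials = 200_000

F = rng.uniform(size=trials)         # uniform prior on F
I = rng.binomial(N, F)               # observed count given each F
F_given_I = F[I == I_obs]            # keep only the runs matching the observed count

# Kolmogorov-Smirnov test against Beta(I_obs+1, N-I_obs+1); a large p-value
# is consistent with the claimed posterior.
print(kstest(F_given_I, beta(I_obs + 1, N - I_obs + 1).cdf))
```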

Best Answer

Consider the order statistics $x_{[0]} \le x_{[1]} \le \cdots \le x_{[n]}$ of $n+1$ independent draws from a uniform distribution. Because order statistics have Beta distributions, the chance that $x_{[k]}$ does not exceed $p$ is given by the Beta integral

$$\Pr[x_{[k]} \le p] = \frac{1}{B(k+1, n-k+1)} \int_0^p{x^k(1-x)^{n-k}dx}.$$

(Why is this? Here is a non-rigorous but memorable demonstration. The chance that $x_{[k]}$ lies between $p$ and $p + dp$ is the chance that out of $n+1$ uniform values, $k$ of them lie between $0$ and $p$, at least one of them lies between $p$ and $p + dp$, and the remainder lie between $p + dp$ and $1$. To first order in the infinitesimal $dp$ we only need to consider the case where exactly one value (namely, $x_{[k]}$ itself) lies between $p$ and $p + dp$ and therefore $n - k$ values exceed $p + dp$. Because all values are independent and uniform, this probability is proportional to $p^k (dp) (1 - p - dp)^{n-k}$. To first order in $dp$ this equals $p^k(1-p)^{n-k}dp$, precisely the integrand of the Beta distribution. The term $\frac{1}{B(k+1, n-k+1)}$ can be computed directly from this argument as the multinomial coefficient ${n+1}\choose{k,1, n-k}$ or derived indirectly as the normalizing constant of the integral.)
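As a quick empirical sanity check of that Beta form (my own sketch, assuming NumPy and SciPy), one can simulate many sets of $n+1$ uniform draws, take the $k$-th order statistic (0-indexed), and compare its empirical cdf at $p$ with the Beta integral above:

```python
# Empirical check that the k-th order statistic (0-indexed) of n+1 uniform
# draws has a Beta(k+1, n-k+1) distribution.
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(1)
n, k, p = 9, 2, 0.4                  # arbitrary example values
trials = 100_000

draws = rng.uniform(size=(trials, n + 1))
x_k = np.sort(draws, axis=1)[:, k]   # the (k+1)-st smallest of each set

print(np.mean(x_k <= p))             # empirical Pr[x_[k] <= p]
print(beta.cdf(p, k + 1, n - k + 1)) # the Beta integral from the answer
```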

By definition, the event $x_{[k]} \le p$ is that the $k+1^\text{st}$ value does not exceed $p$. Equivalently, at least $k+1$ of the values do not exceed $p$: this simple (and I hope obvious) assertion provides the intuition you seek. The probability of the equivalent statement is given by the Binomial distribution,

$$\Pr[\text{at least }k+1\text{ of the }x_i \le p] = \sum_{j=k+1}^{n+1}{{n+1}\choose{j}} p^j (1-p)^{n+1-j}.$$

In summary, the Beta integral breaks the calculation of an event into a series of calculations: finding at least $k+1$ values in the range $[0, p]$, whose probability we normally would compute with a Binomial cdf, is broken down into mutually exclusive cases where exactly $k$ values are in the range $[0, x]$ and 1 value is in the range $[x, x+dx]$ for all possible $x$, $0 \le x \lt p$, and $dx$ is an infinitesimal length. Summing over all such "windows" $[x, x+dx]$--that is, integrating--must give the same probability as the Binomial cdf.
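A minimal numerical sketch of that integration (mine, assuming SciPy; the values of $n$, $k$, $p$ are arbitrary) makes the agreement concrete: integrate the "window" density over $[0, p]$ and compare it with the Binomial probability of at least $k+1$ of the $n+1$ values falling below $p$.

```python
# Integrate the "window" density over [0, p] and compare with the Binomial
# tail probability of at least k+1 of the n+1 uniform values lying below p.
from math import factorial
from scipy.integrate import quad
from scipy.stats import binom

n, k, p = 9, 2, 0.4

# Multinomial coefficient (n+1)! / (k! * 1! * (n-k)!): choose which k values
# fall in [0, x], which single value falls in the window, and which n-k exceed it.
coeff = factorial(n + 1) // (factorial(k) * factorial(n - k))

integral, _ = quad(lambda x: coeff * x**k * (1 - x)**(n - k), 0, p)
tail = binom.sf(k, n + 1, p)         # P(at least k+1 of n+1 values <= p)

print(integral, tail)                # the two values agree
```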

