[Math] What’s the difference between expected values in binomial distributions and hypergeometric distributions?

probability, probability-distributions, probability-theory, stochastic-processes

The formula for the expected value in a binomial distribution is:

$$E(X) = nP(s)$$
where $n$ is the number of trials and $P(s)$ is the probability of success.

The formula for the expected value in a hypergeometric distribution is:

$$E(X) = \frac{ns}{N}$$
where $N$ is the population size, $s$ is the number of successes available in the population and $n$ is the number of trials.

$$E(X) = \left( \frac{s}{N} \right)n $$
$$P(s) = \frac{s}{N}$$
$$\implies E(X) = nP(s)$$

Why do both the distributions have the same expected value? Why doesn't the independence of the events have any effect on expected value?

Best Answer

For either one, let $X_i=1$ if there is a success on the $i$-th trial, and $X_i=0$ otherwise. Then $$X=X_1+X_2+\cdots+X_n,$$ and therefore by the linearity of expectation $$E(X)=E(X_1)+E(X_2)+\cdots +E(X_n)=nE(X_1). \tag{1}$$ Note that linearity of expectation does not require independence.
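As a sanity check on the claim that linearity of expectation holds without independence, here is a small sketch (with illustrative numbers $N=5$, $s=2$, $n=2$, not taken from the question) that enumerates all ordered draws without replacement, where the trials are clearly dependent, and confirms $E(X_1+X_2)=E(X_1)+E(X_2)$ exactly:

```python
from fractions import Fraction
from itertools import permutations

# Population: N = 5 objects, s = 2 "good" ones (illustrative numbers).
population = [1, 1, 0, 0, 0]  # 1 = success, 0 = failure
n = 2  # draws without replacement, so X_1 and X_2 are dependent

# Enumerate all equally likely ordered draws of size n.
draws = list(permutations(population, n))
total = len(draws)

# E(X) computed directly from the joint distribution of (X_1, X_2).
E_X = Fraction(sum(x1 + x2 for x1, x2 in draws), total)

# E(X_1) and E(X_2) computed from the marginals.
E_X1 = Fraction(sum(x1 for x1, _ in draws), total)
E_X2 = Fraction(sum(x2 for _, x2 in draws), total)

assert E_X == E_X1 + E_X2 == Fraction(2 * 2, 5)  # n*s/N = 4/5
```

The dependence between draws changes the joint distribution (and hence the variance), but the expectation of the sum still splits into the sum of the marginal expectations.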

In the hypergeometric case, $\Pr(X_i=1)=\frac{s}{N}$, where $s$ is the number of "good" objects among the $N$. This is because any object is just as likely to be the $i$-th one chosen as any other. So $E(X_i)=1\cdot\frac{s}{N}+0\cdot \frac{N-s}{N}=\frac{s}{N}$. It follows that $E(X)=n\frac{s}{N}$.
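A quick Monte Carlo sketch of this (with made-up parameters $N=10$, $s=4$, $n=5$): sampling without replacement and averaging should land near $n\frac{s}{N}=2$.

```python
import random

random.seed(42)
N, s, n = 10, 4, 5  # population size, successes in population, draws
population = [1] * s + [0] * (N - s)

trials = 100_000
# random.sample draws n items without replacement: a hypergeometric trial.
total = sum(sum(random.sample(population, n)) for _ in range(trials))
empirical_mean = total / trials

print(empirical_mean, "vs exact", n * s / N)  # empirical mean close to 2.0
```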

Essentially the same proof works for the binomial distribution: there $\Pr(X_i=1)=p$ for every trial, so $E(X_i)=p$ and $E(X)=np$. Both expectations follow from Formula $(1)$.
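To see that the two means agree exactly and not just in simulation, one can sum $k\cdot\Pr(X=k)$ over each pmf directly. A self-contained sketch with the same illustrative parameters ($N=10$, $s=4$, $n=5$, so $p=s/N$), using only the standard library:

```python
from math import comb, isclose

N, s, n = 10, 4, 5
p = s / N

# Binomial mean: sum of k * C(n,k) p^k (1-p)^(n-k) over k = 0..n.
binom_mean = sum(
    k * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)
)

# Hypergeometric mean: sum of k * C(s,k) C(N-s,n-k) / C(N,n) over the support.
hyper_mean = sum(
    k * comb(s, k) * comb(N - s, n - k) / comb(N, n)
    for k in range(max(0, n - (N - s)), min(n, s) + 1)
)

# Both equal n * s / N = n * p, despite the different pmfs.
assert isclose(binom_mean, n * p) and isclose(hyper_mean, n * p)
```

The pmfs differ (the hypergeometric one has smaller variance, since draws without replacement are negatively correlated), but the first moments coincide.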
