[Math] Weighted hypergeometric distribution

probability distributions

The probability of $M$ successes out of $N$ draws without replacement from a population of $S$ success and $F$ failure states, assuming that each remaining state is equally likely, is given by

$\dfrac{\binom{S}{M} \binom{F}{N-M}}{\binom{S+F}{N}}$.

We can think of it as drawing $N$ marbles from an urn containing $S$ white and $F$ black marbles, and winding up with $M$ of those marbles being white. This is known as the hypergeometric distribution.

The assumption that each remaining state is equally likely amounts to saying that for a single draw from a population of $S$ success and $F$ failure states, the probability of success is $\frac{S}{S+F}$.
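To make the unweighted baseline concrete, here is a minimal Python sketch of the mass function above. The function name `hypergeom_pmf` and the example values $S=3$, $F=5$ are illustrative choices, not part of the question; exact rational arithmetic is used only so the checks are exact.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(m, n, s, f):
    """P(m successes in n draws without replacement from s successes, f failures)."""
    if not 0 <= n <= s + f:
        raise ValueError("n must be between 0 and s + f")
    # Outcomes that cannot occur have probability zero.
    if m < 0 or m > s or n - m < 0 or n - m > f:
        return Fraction(0)
    return Fraction(comb(s, m) * comb(f, n - m), comb(s + f, n))


# A single draw reduces to S / (S + F).
print(hypergeom_pmf(1, 1, 3, 5))  # 3/8
# The probabilities over all m sum to 1.
print(sum(hypergeom_pmf(m, 4, 3, 5) for m in range(5)))  # 1
```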

Now my problem: assume that the success states have a weight $W$, which can be either larger or smaller than $1$. In a population of $S$ success and $F$ failure states, the probability of drawing a success state is $\frac{WS}{WS + F}$. In the urn model, this can mean that one color of marbles is easier to get a grip on than the other.
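If the marbles are drawn one at a time and the rule above is applied to whatever remains in the urn at each step, the mass function can be computed exactly by a simple recursion over the number of successes drawn so far, which avoids writing out a closed form. The following is a sketch under that one-at-a-time assumption; the function name `sequential_weighted_pmf` is hypothetical, and $W > 0$ is assumed.

```python
from fractions import Fraction


def sequential_weighted_pmf(n, s, f, w):
    """Distribution of the number of successes after n one-at-a-time draws.

    With a successes and b failures remaining, the next draw is a success
    with probability w*a / (w*a + b). Returns a list p with p[m] = P(m successes).
    """
    if not 0 <= n <= s + f:
        raise ValueError("n must be between 0 and s + f")
    w = Fraction(w)
    if w <= 0:
        raise ValueError("w must be positive")
    p = [Fraction(1)]  # before any draw: zero successes with certainty
    for t in range(n):  # t = number of draws made so far
        nxt = [Fraction(0)] * (t + 2)
        for m, prob in enumerate(p):
            if prob == 0:
                continue  # unreachable state, nothing to propagate
            a = s - m        # successes remaining
            b = f - (t - m)  # failures remaining
            p_succ = w * a / (w * a + b)
            nxt[m + 1] += prob * p_succ
            nxt[m] += prob * (1 - p_succ)
        p = nxt
    return p


print(sequential_weighted_pmf(1, 3, 5, 2)[1])  # 6/11, i.e. WS / (WS + F)
```

The recursion uses on the order of $N \cdot \min(N, S)$ arithmetic operations, and the cumulative distribution is a running sum of the returned list. Passing a float for `w` still yields exact fractions of that float's value; replacing `Fraction` with plain floats is an obvious speed-up when exactness is not needed.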

I can certainly manage to find a probability mass function, but my expression gets a rather complex sum in the denominator.

Does this distribution have a name that can help me find known properties for it? My primary interest is finding an efficient algorithm for computing the probability mass function and the cumulative distribution. So, for instance, I would be interested in a mass function expression that is as simple as possible. Approximations for different parameter settings are also of interest.

Best Answer

The non-central hypergeometric distributions fit the bill exactly.

However, the subtlety with weighted drawing is that the distribution of white balls in the sample depends on the sampling policy, unlike with the hypergeometric distribution. If $k$ balls are sampled one after another in a competitive manner, the result is known as Wallenius' non-central hypergeometric distribution, whereas if the $k$ balls are sampled independently, i.e. at once, the result is known as Fisher's non-central hypergeometric distribution.
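For reference, the standard mass functions can be written in the question's notation, with $X$ the number of white balls among the $k$ sampled and $W$ the weight (the symbols $X$, $x$, $y$, $t$ and $D$ are introduced here for illustration). For Fisher's distribution,

$$P(X = x) = \frac{\binom{S}{x}\binom{F}{k-x}\,W^{x}}{\sum_{y}\binom{S}{y}\binom{F}{k-y}\,W^{y}}, \qquad \max(0, k-F) \le x \le \min(k, S),$$

where the sum runs over the same range of $y$. For Wallenius' distribution,

$$P(X = x) = \binom{S}{x}\binom{F}{k-x}\int_0^1 \left(1 - t^{W/D}\right)^{x}\left(1 - t^{1/D}\right)^{k-x}\,dt, \qquad D = W(S-x) + \bigl(F-(k-x)\bigr).$$

As a sanity check, setting $k = 1$ and $x = 1$ in either expression gives $\frac{WS}{WS+F}$.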

Of course, both distributions agree for $k=1$. For example, using Mathematica:

In[236]:= With[{s = 3, f = 5}, 
 PDF[FisherHypergeometricDistribution[1, s, s + f, w], 1]]

Out[236]= (3 w)/(5 + 3 w)

In[238]:= 
With[{s = 3, f = 5}, 
  PDF[WalleniusHypergeometricDistribution[1, s, s + f, w], 
   1]] // Simplify

Out[238]= (3 w)/(5 + 3 w)
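Outside Mathematica, Fisher's variant is straightforward to compute directly, since its mass function is the hypergeometric numerator multiplied by $w^x$ and renormalised. Here is a minimal Python sketch under that definition; the names `fisher_pmf_table` and `fisher_cdf` are hypothetical, and exact fractions are used only so the check against the output above is exact.

```python
from fractions import Fraction
from math import comb


def fisher_pmf_table(k, s, f, w):
    """Return {x: P(X = x)} for Fisher's non-central hypergeometric distribution.

    k balls sampled, s success states, f failure states, weight w on successes.
    """
    if not 0 <= k <= s + f:
        raise ValueError("k must be between 0 and s + f")
    w = Fraction(w)
    lo, hi = max(0, k - f), min(k, s)  # feasible numbers of successes
    weights = {x: comb(s, x) * comb(f, k - x) * w**x for x in range(lo, hi + 1)}
    total = sum(weights.values())      # the normalising sum in the denominator
    return {x: v / total for x, v in weights.items()}


def fisher_cdf(x, k, s, f, w):
    """P(X <= x), obtained by summing the mass function."""
    table = fisher_pmf_table(k, s, f, w)
    return sum((p for y, p in table.items() if y <= x), Fraction(0))


# k = 1, s = 3, f = 5, w = 2 reproduces 3w / (5 + 3w) = 6/11.
print(fisher_pmf_table(1, 3, 5, 2)[1])  # 6/11
print(fisher_cdf(1, 2, 3, 5, 2))        # 10/13
```

For large populations the binomial coefficients and powers grow quickly, so a practical implementation would more likely work with logarithms of the terms; that refinement is left out here.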