Density function for a multivariate Bernoulli-like distribution

distributions

I'm looking for a distribution to model a vector of $k$ binary random variables, $X_1, \ldots, X_k$. Suppose I have observed that $\sum_i X_i = n$. In this case I do not want to treat them as independent Bernoulli random variables. Instead, I would like something like the multinomial:

$P(X_1=x_1, \ldots, X_k=x_k) = f(x_1, \ldots, x_k; n, p_1, \ldots, p_k) = \frac{n!}{x_1! \cdots x_k!} \prod_{i=1}^k p_i^{x_i}$

but instead of the $x_i$ being nonnegative integers, I want them restricted to be either 0 or 1. I have been trying to see if the multivariate hypergeometric is appropriate, but I'm not sure.

Thanks in advance for any advice.

Best Answer

The appropriate distribution is Wallenius's noncentral hypergeometric distribution. Using an urn analogy, the problem is equivalent to drawing $n$ of $k$ balls one at a time without replacement, where each ball has a different color (so there is exactly one ball per color). The parameters $p_i$ play the role of weights that bias the chance of drawing color $i$, and $X_i = 1$ exactly when the ball of color $i$ is among the $n$ drawn.
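To make the urn reading concrete, here is a sketch of the mass function it implies. It assumes the weights are normalized so that $\sum_i p_i = 1$ (an assumption added here for convenience; only the ratios of the weights matter). For $x \in \{0,1\}^k$ with $\sum_i x_i = n$, let $S = \{i : x_i = 1\}$ and sum over the orders $\sigma$ in which the balls in $S$ could have been drawn:

$P(X_1=x_1, \ldots, X_k=x_k) = \sum_{\sigma \in \mathrm{Perm}(S)} \prod_{j=1}^{n} \frac{p_{\sigma(j)}}{1 - \sum_{l<j} p_{\sigma(l)}}$

For example, with $k=3$, $n=2$, and $p = (1/6, 2/6, 3/6)$, the outcome $x = (1,1,0)$ has probability $\frac{1}{6}\cdot\frac{2/6}{5/6} + \frac{2}{6}\cdot\frac{1/6}{4/6} = \frac{1}{15} + \frac{1}{12} = \frac{3}{20}$.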

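The drawback is that it is not very convenient to work with: its probability mass function is defined through an integral rather than a simple closed-form expression. There is an R package for it, though (BiasedUrn).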
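For small $k$, the probabilities can also be computed by brute force, which may be enough for checking results or for toy problems. The following is a minimal Python sketch, not the R package's method: it treats the model as sequential weighted draws without replacement with one ball per color, and sums over every order in which a given subset could be drawn. The function names `subset_probability` and `pmf` are hypothetical, chosen only for this illustration, and the cost grows like $n!$ per outcome, so it is not practical for large $n$.

```python
from itertools import combinations, permutations


def subset_probability(subset, weights):
    """Probability that sequential weighted draws without replacement
    select exactly the indices in `subset` (in any order)."""
    total = sum(weights)
    prob = 0.0
    for order in permutations(subset):
        remaining = total  # total weight still in the urn
        p_order = 1.0
        for i in order:
            p_order *= weights[i] / remaining
            remaining -= weights[i]
        prob += p_order
    return prob


def pmf(weights, n):
    """Map each 0/1 vector with exactly n ones to its probability."""
    k = len(weights)
    table = {}
    for subset in combinations(range(k), n):
        x = tuple(1 if i in subset else 0 for i in range(k))
        table[x] = subset_probability(subset, weights)
    return table


# Illustrative weights; they need not sum to 1, only their ratios matter.
weights = [1.0, 2.0, 3.0]
table = pmf(weights, n=2)
for x, p in table.items():
    print(x, round(p, 4))
```

The table has one entry per 0/1 vector with exactly $n$ ones, and the entries sum to 1, which is a quick sanity check on any set of weights.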
