Solved – How to generate numbers based on an arbitrary discrete distribution

distributions

How do I generate numbers based on an arbitrary discrete distribution?

For example, I have a set of numbers that I want to generate. Say they are labelled from 1-3 as follows.

1: 4%, 2: 50%, 3: 46%

Basically, the percentages are probabilities that they will appear in the output from the random number generator. I have a pesudorandom number generator that will generate a uniform distribution in the interval [0, 1]. Is there any way of doing this?

There are no bounds on how many elements I can have, but the % will add up to 100%.

Best Answer

One of the best algorithms for sampling from a discrete distribution is the alias method.

The alias method (efficiently) precomputes a two-dimensional data structure to partition a rectangle into areas proportional to the probabilities.

Figure

In this schematic from the referenced site, a rectangle of unit height has been partitioned into four kinds of regions--as differentiated by color--in the proportions $1/2$, $1/3$, $1/12$, and $1/12$, in order to sample repeatedly from a discrete distribution with these probabilities. The vertical strips have a constant (unit) width. Each is divided into just one or two pieces. The identities of the pieces and the locations of the vertical divisions are stored in tables accessible via the column index.

The table can be sampled in two simple steps (one for each coordinate) requiring generating just two independent uniform values and $O(1)$ calculation. This improves on the $O(\log(n))$ computation needed to invert the discrete CDF as described in other replies here.