Random Generation – How Does Numpy Generate Samples from a Beta Distribution?

distributionsnumpyrandom-generation

numpy lets you generate random samples from a beta distribution (or any other arbitrary distribution) with this API:

samples = np.random.beta(a,b, size=1000)

What is this doing beneath the hood? Inverse Transform Sampling?

Best Answer

The code for numpy.random.beta is found at legacy-distributions.c at the time of this writing.

  1. When $a$ and $b$ are both 1 or less, then Jöhnk's beta generator is used (see page 418 of Non-Uniform Random Variate Generation), with a modification to avoid divisions by zero.
  2. Otherwise, it uses the formula $X/(X+Y)$ where $X$ and $Y$ are gamma($a$) and gamma($b$), respectively, generated by the legacy gamma generator. (In general, the gamma generator uses Marsaglia and Tsang's algorithm [2000] unless the parameter is less than 1.)

Notice that numpy.random.* functions (including numpy.random.beta) became legacy functions since NumPy 1.17 introduced a new system for pseudorandom number generation (see the NumPy RNG Policy). Thus, the implementation of numpy.random.beta is not expected to change for as long as numpy.random.* functions are still present in NumPy, and the beta generator used in the new RNG system may differ from the one presented here.

In fact, at the time of this writing, the new generator is slightly different from the legacy generator (see distributions.c at the time of this writing), and one of the reasons for introducing the new RNG system is to allow the non-uniform random generators to be improved without having to maintain backward compatibility.

REFERENCES:

  • Marsaglia, G., Tsang, W.W., "A simple method for generating gamma variables", ACM Transactions on Mathematical Software 26(3), 2000.
Related Question