Solved – What are the advantages of an exponential random generator using the method of Ahrens and Dieter (1972) rather than by inverse transform

My question is inspired by R's built-in exponential random number
generator, the function rexp().
When trying to generate exponentially distributed random numbers,
many textbooks recommend the inverse transform method as outlined in this Wikipedia page.
I am aware that there are other methods to accomplish this task. In
particular, R's source code uses the algorithm
outlined in a paper by Ahrens & Dieter (1972).

I have convinced myself that the Ahrens-Dieter (AD) method is
correct. Still, I do not see the benefit of using their method
compared to the inverse transform (IT) method. AD is not only more
complex to implement than IT. There does not seem to be a speed
benefit either. Here is my R code to benchmark both methods
followed by the results.

invTrans <- function(n)
    -log(runif(n))
print("For the inverse transform:")
print(system.time(invTrans(1e8)))
print("For the Ahrens-Dieter algorithm:")
print(system.time(rexp(1e8)))

Results:

[1] "For the inverse transform:" 
user     system     elapsed
4.227    0.266      4.597 
[1] "For the Ahrens-Dieter algorithm:"
user     system     elapsed
4.919    0.265      5.213

Comparing the code for the two methods, AD draws at least two
uniform random numbers (with the C function unif_rand()) to
get one exponential random number. IT only needs one uniform
random number.
Presumably the R core team decided against implementing IT
because it assumed that taking the logarithm may be slower than
generating more uniform random numbers. I understand that the
speed of taking logarithms can be machine-dependent, but at least
for me the opposite is true.
Perhaps there are issues around IT’s numerical precision having to
do with the singularity of the logarithm at 0? But then, the R
source code sexp.c reveals that the implementation of AD also loses
some numerical precision because the following portion of C code
removes the leading bits from the uniform random number u.

double u = unif_rand();
while(u <= 0. || u >= 1.) u = unif_rand();
for (;;) {
    u += u;
    if (u > 1.)
        break;
    a += q[0];
}
u -= 1.;

u is later recycled as a uniform random number in the remainder
of sexp.c.
So far, it appears as if

IT is easier to code,
IT is faster, and
both IT and AD possibly lose numerical accuracy.

I would really appreciate if someone could explain why R still
implements AD as the only available option for rexp().

> print(system.time(rexp(1e8))) utilisateur système écoulé 4.617 0.320 4.935 > print(system.time(rexp(1e8))) utilisateur système écoulé 4.589 2.045 6.629 > print(system.time(-log(runif(1e8)))) utilisateur système écoulé 7.455 1.080 8.528 > print(system.time(-log(runif(1e8)))) utilisateur système écoulé 9.140 1.489 10.623

> print(system.time(rexp(1e8,rate=.01))) utilisateur système écoulé 4.594 0.456 5.047 > print(system.time(rexp(1e8,rate=.01))) utilisateur système écoulé 4.661 1.319 5.976 > print(system.time(-log(runif(1e8))/.01)) utilisateur système écoulé 15.675 2.139 17.803 > print(system.time(-log(runif(1e8))/.01)) utilisateur système écoulé 7.863 1.122 8.977 > print(system.time(rexp(1e8,rate=101.01))) utilisateur système écoulé 4.610 0.220 4.826 > print(system.time(rexp(1e8,rate=101.01))) utilisateur système écoulé 4.621 0.156 4.774 > print(system.time(-log(runif(1e8))/101.01)) utilisateur système écoulé 7.858 0.965 8.819 > > print(system.time(-log(runif(1e8))/101.01)) utilisateur système écoulé 13.924 1.345 15.262

> benchmark(x=rexp(1e6,rate=101.01)) elapsed user.self sys.self 4.617 4.564 0.056 > benchmark(x=-log(runif(1e6))/101.01) elapsed user.self sys.self 14.749 14.571 0.184 > benchmark(x=rgamma(1e6,shape=1,rate=101.01)) elapsed user.self sys.self 14.421 14.362 0.063 > benchmark(x=rexp(1e6,rate=.01)) elapsed user.self sys.self 9.414 9.281 0.136 > benchmark(x=-log(runif(1e6))/.01) elapsed user.self sys.self 7.953 7.866 0.092 > benchmark(x=rgamma(1e6,shape=1,rate=.01)) elapsed user.self sys.self 26.69 26.649 0.056

Best Answer

On my computer (pardon my French!):

the inverse transform does worse. But you should watch out for variability. Introducing a rate parameter leads to even more variability for the inverse transform:

Here are the comparisons using rbenchmark:

So mileage still varies, depending on the scale!

Best Answer

Related Solutions

Solved – Generating random samples from a given distribution

Solved – Transform standard normally distributed numbers into arbitrary normal distribution

Related Question