[Math] Generating geometrically-distributed variates over an interval given endpoints and expected value

algorithmsrandom

I want to generate random variates from a truncated geometric distribution over the interval $[0, n)$ with specified expected value, $0 \le E < n$. The obvious way to do this seems to be to sample the exponential distribution over the same interval and round down to the nearest integer, but I've run into some technical difficulties, in (what I believe to be) decreasing order of seriousness:

  1. I don't know how to truncate the distribution except by rejection sampling, but I cannot do that, because the surrounding code requires a constant-time operation.

  2. I'm not confident that rounding an exponential variate down to the nearest integer gives the proper distribution of integers.

  3. The obvious way to generate an exponential variate over $[0, n)$ in constant time is inverse transform sampling: $T = -E \log U$ where $E$ is the expected value and $U$ is a uniform variate over $(0, 1]$. However, on a real computer (using IEEE doubles for all calculations), this will not generate variates larger than $744.44E$, because $U$ cannot be smaller than approximately $5\times 10^{-324}$. In my application, the desired upper limit $n$ might be quite large. I'm not sure this is a problem, since variates larger than $744.44E$ should appear less than once in $2 \times 10^{323}$ trials, but I can't convince myself it's genuinely not a problem.

  4. When $E=0$, the above equation always evaluates to zero. Is that the correct behavior?

Best Answer

I'm assuming that $E$ is the expected value of the full geometric distribution, not of the truncated one. You've used $E$ for the expected value of both the geometric distribution and the exponential distribution; since they need to be slightly different, I'll use $E'$ for the expected value of the geometric distribution.

  1. You can truncate the distribution by restricting $U$ to a range such that $T\in[0,n)$; that is, $U$ should be uniformly distributed in $(\mathrm e^{-n/E},1]$.

  2. It does if you choose the right parameter. The area under each interval of length $1$ of the continuous distribution is larger by a factor of $\mathrm e^{-1/E}$ than the one to the right, so the expected value of the geometric distribution you get will be $E'=\mathrm e^{-1/E}/(1-\mathrm e^{-1/E})=1/(\mathrm e^{1/E}-1)$. Thus, given $E'$, you can determine $E$ as $E=1/\log(1+1/E')$.

  3. As you wrote, this is quite unlikely to be a problem in your lifetime or anyone else's.

  4. Yes. An exponential distribution can only have expected value $0$ if it's degenerate, a delta distribution at the origin.

Related Question