The Park-Miller random number generator

computational-mathematics, computer-science, elementary-number-theory

I'm looking into the math behind a pseudorandom number generator called the Park-Miller random number generator, defined by the recurrence relation $X_{n+1}=aX_n\bmod{m}$. Here $a$ is known as the multiplier, and a common value for it is $48271$; $m$ is known as the modulus and is usually a Mersenne prime, a common value being $2^{31}-1$.
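For concreteness, here's a minimal Python sketch of that recurrence with those common values (just my own illustration; real implementations in fixed-width integer languages do the multiplication more carefully to avoid overflow, which isn't an issue in Python):

```python
# Minimal sketch of the recurrence X_{n+1} = a * X_n mod m with the
# commonly cited constants a = 48271, m = 2^31 - 1.
def park_miller(seed, a=48271, m=2**31 - 1):
    """Yield successive values X_1, X_2, ... starting from X_0 = seed."""
    x = seed
    while True:
        x = (a * x) % m
        yield x

gen = park_miller(1)
print([next(gen) for _ in range(5)])  # first five outputs from seed 1
```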
I have 3 questions, namely:

  1. Everywhere I've seen this explained, it seems like the $a$ parameter gives the largest period if it is what's called a "primitive $n$th root of unity", which, when I hear it, makes me think of the roots used in Fourier transforms. The definition involves the totient function $\varphi(n)$, which I do understand, but I need some clarification as to what these special numbers are. I'm not really a number theorist, just curious.
  2. Why exactly does this give random-looking numbers? I think it has something to do with the choice of modulus $m$ and multiplier $a$, but why exactly do these determine the period, and how so?
  3. When I was doing my research, I came across a particular set of parameters: $a=65539$ and $m=2^{31}$. These are used in a variant of this algorithm called RANDU, which was used a lot in the 60s and 70s and is nowadays widely considered one of the worst pseudorandom number generators ever invented. Why is this, and how can one determine how "good" a pseudorandom number generator is?

I hope this post isn't a repeat of something else.

Best Answer

This algorithm will, by its nature, be periodic, which means that there will be some value $k$ such that $X_{n + k} = X_n$ for all $n$. To see why, notice that there are $m$ possible outputs, so after $m + 1$ iterations there must be at least one pair of identical values produced, and as soon as you get one repeat then the entire sequence has to repeat itself (since if $X_{n + k} = X_n$ you also have $X_{n + k + 1} = X_{n + 1}$ and so on).
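To make that concrete, here is a small Python sketch that finds the period by iterating until a value repeats (the same small parameters, $a = 6$ and $m = 11$, reappear in a table further down):

```python
# Sketch: find the period of X_{n+1} = a*X_n mod m by iterating until a value
# repeats.  Pigeonhole guarantees a repeat within m steps, and once one value
# repeats the whole sequence cycles.
def find_period(a, m, x0):
    seen = {}            # value -> step at which it first appeared
    x, n = x0, 0
    while x not in seen:
        seen[x] = n
        x = (a * x) % m
        n += 1
    return n - seen[x]   # length of the cycle the sequence settles into

print(find_period(6, 11, 1))   # 10, i.e. phi(11): every nonzero residue appears
```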

Since the algorithm always multiplies by the same value, we can actually write the $n$th output explicitly as $X_n = a^n X_0 \mod m$. You've mentioned the totient function, so now's a good time to mention Euler's theorem (aka "Euler's extension of Fermat's little theorem"):
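You can sanity-check that closed form with Python's three-argument `pow`, which does modular exponentiation directly (same toy parameters as the sketch above):

```python
# Check that the closed form X_n = a^n * X_0 mod m agrees with iterating the
# recurrence step by step (toy parameters: a = 6, m = 11, X_0 = 1).
a, m, x0 = 6, 11, 1
x = x0
for n in range(1, 11):
    x = (a * x) % m                       # one step of the recurrence
    assert x == (pow(a, n, m) * x0) % m   # closed form, via modular exponentiation
print("closed form matches the recurrence for n = 1..10")
```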

  • If $a$ and $m$ are coprime (i.e. they share no common divisors other than $1$), then $a^{\varphi(m)} \equiv 1 \mod m$

If $p$ is prime, then this reduces to Fermat's little theorem, which states that $a^{p-1} \equiv 1 \mod p$ for $1 \leq a < p$. For example, $2^4 = 16 \equiv 1 \mod 5$.
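If you want to see Euler's theorem numerically, here's a quick brute-force check of my own (the totient is computed naively, which is fine for small $m$):

```python
# Brute-force check of Euler's theorem: a^phi(m) ≡ 1 (mod m) whenever
# gcd(a, m) = 1.  phi is computed naively by counting coprime residues.
from math import gcd

def phi(m):
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

m = 12                          # a composite modulus, to go beyond Fermat's case
print(phi(m))                   # 4
for a in range(1, m):
    if gcd(a, m) == 1:
        assert pow(a, phi(m), m) == 1
print("Euler's theorem checks out for every a coprime to", m)
```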

However, notice that while $4^4 = 256 \equiv 1 \mod 5$, we also have $4^2 = 16 \equiv 1 \mod 5$, so clearly the theorem doesn't tell you the smallest power you can raise $a$ to in order to get back to $1$. If, on the other hand, $a^k \not\equiv 1 \mod m$ for all $1 \leq k < \varphi(m)$, then we say that $a$ is a primitive root mod $m$, and primitive roots are important here because they give you the longest possible sequence of distinct values in the algorithm.

For example, if we work mod 11 and take $a = 6$ and $X_0 = 1$, then we get the following outputs from the algorithm:

| $n$   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|-------|---|---|---|---|---|---|---|---|---|---|
| $X_n$ | 1 | 6 | 3 | 7 | 9 | 10 | 5 | 8 | 4 | 2 |

Notice that if we started with, say, $X_0 = 5$ we would get the same sequence, but starting at a different position.

Finding primitive roots is not necessarily difficult, although it can be a bit tedious depending on your modulus. You can prove fairly easily that every number's order modulo $m$ (i.e. the smallest positive power of it that is congruent to $1$) must be a divisor of $\varphi(m)$, so to test whether a number is a primitive root you just raise it to the proper divisors of $\varphi(m)$ and check that none of them give you $1$ (technically, it's enough to test the exponents $\varphi(m) / p_i$, where the $p_i$ are the prime factors of $\varphi(m)$).
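Here's a sketch of that test, assuming the modulus is prime so that $\varphi(m) = m - 1$; the factoring is plain trial division, which is perfectly adequate for $\varphi(2^{31}-1)$:

```python
# Test whether a is a primitive root mod a prime m: a is a primitive root
# iff a^(phi/p) != 1 (mod m) for every prime factor p of phi = m - 1.
def prime_factors(n):
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def is_primitive_root(a, m):
    phi = m - 1                      # assumes m is prime
    return all(pow(a, phi // p, m) != 1 for p in prime_factors(phi))

print(is_primitive_root(6, 11))             # True  (matches the table above)
print(is_primitive_root(4, 5))              # False (4 has order 2 mod 5)
print(is_primitive_root(48271, 2**31 - 1))  # True: the Park-Miller multiplier
```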

Why does this algorithm result in something resembling randomness? I won't go into the full details, but since we associate "randomness" with "unpredictability", one way to look at it is to look at what happens with slightly different starting values (or seeds). Taking the example from above but comparing $X_0 = 3$ and $X_0' = 4$:

| $n$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| $X_n$ | 3 | 7 | 9 | 10 | 5 | 8 | 4 | 2 | 1 |
| $X_n'$ | 4 | 2 | 1 | 6 | 3 | 7 | 9 | 10 | 5 |
| $X_n' - X_n \mod m$ | 1 | 6 | 3 | 7 | 9 | 10 | 5 | 8 | 4 |

As you can see, changing the seed by $1$ produces vastly different outputs across the period of the algorithm. For a rough idea of why, scale everything down by $\frac{1}{m}$ so that all the numbers lie between $0$ and $1$, and think about what happens to the interval $[x, x + \delta]$ as it iterates. It's being multiplied by $a$ at every step, so at first it covers a range of $\delta$, then $a \delta$, then $a^2 \delta$, and so forth; but it gets wrapped back around past $0$ and overlaps itself in many different ways, until after $\varphi(m)$ iterations it covers the whole interval many times over. That means that if someone has imperfect knowledge about the starting point (e.g. they know it was somewhere between $x$ and $x + \delta$), they become increasingly unable to guess where it will be several steps later unless they learn more about it.
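Here's a rough numeric illustration of that stretching-and-wrapping (my own sketch, using the Park-Miller constants from your question): a handful of adjacent seeds are indistinguishable at first but spread across the whole unit interval within a few steps.

```python
# Nearby seeds spread out quickly: track eleven adjacent seeds under
# X_{n+1} = 48271 * X_n mod (2^31 - 1), scaled down to [0, 1).
a, m = 48271, 2**31 - 1
states = list(range(1000, 1011))          # eleven adjacent seeds
for step in range(1, 6):
    states = [(a * x) % m for x in states]
    scaled = sorted(x / m for x in states)
    print(f"step {step}: spread from {scaled[0]:.4f} to {scaled[-1]:.4f}")
```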

At least in theory. But to answer your last question: why was RANDU bad? The answer is that what I just described only suggests the individual outputs of the RNG might look random enough; once you start combining consecutive outputs, the flaws show up. As the Wikipedia article explains, any three consecutive draws from RANDU satisfy $x_{n+2} = 6 x_{n+1} - 9 x_n \mod 2^{31}$, which means the points $(x_n, x_{n+1}, x_{n+2})$ all lie on just 15 parallel planes once the wrap-around from the modulus is accounted for. That is not great if you want to sample uniformly across 3D space and instead only get points from 15 slices of your cube.
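You can see that defect directly: generate consecutive RANDU outputs and check that $x_{n+2} - 6 x_{n+1} + 9 x_n$ is always an exact multiple of $2^{31}$, with only a handful of distinct multiples showing up (one per plane). A small sketch:

```python
# RANDU: x_{n+1} = 65539 * x_n mod 2^31.  Because 65539^2 ≡ 6*65539 - 9 (mod 2^31),
# every triple of consecutive outputs satisfies
#   x_{n+2} - 6*x_{n+1} + 9*x_n ≡ 0 (mod 2^31),
# so the integer (x_{n+2} - 6*x_{n+1} + 9*x_n) / 2^31 can only take a handful of
# values: one for each plane the 3D points land on.
a, m = 65539, 2**31

def randu(seed, count):
    x, out = seed, []
    for _ in range(count):
        x = (a * x) % m
        out.append(x)
    return out

xs = randu(1, 100_000)
planes = {(xs[i + 2] - 6 * xs[i + 1] + 9 * xs[i]) // m for i in range(len(xs) - 2)}
print(len(planes), sorted(planes))   # a short list of small integers, at most ~15
```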

That's why modern RNG algorithms are subject to a large variety of tests to check, essentially, how easy it is to guess the next value it will output based on knowing something about its previous outputs, in addition to theoretical proofs that there aren't any such simple relations between values.
