Solved – Verifying that a random generator outputs a uniform distribution

distributionshistogramhypothesis testingprobabilityuniform distribution

I asked a student of mine this question:

If you have a random number generator that outputs a number between
$1$ and $k$, how would you write a test that decides whether the
generated distribution is uniform ?

The student understood that we need to call the generator $n$ times and save a histogram $h_i$.

However I would expect the end test to be:
$$\sum{(h_i-\frac{n}{k})^2}<t$$
Where $t$ is a threshold that depends on $n$ and $k$

And what the student suggested is
$$1-t<\frac{h_i}{n/k}<1+t$$
Again, for some $t$ that is a function of $n$ and $k$

Is the student correct ?
Are those two ways equivalent ?

Best Answer

UPDATE 2: striked wrong points, and replaced some by stuff I think are correct.

UPDATE 1: added direct analysis of both your uniformity tests, and kept my old answer as a proposed superior uniformity test that I suggest.

Summary

  • Both you and your student have offered uniformity tests that work.
  • Both of your tests don't work well with continuous numbers (i.e. if RNG spits out numbers in $[1,k]$ instead of $\{1,2,\ldots,k\}$.
  • Since the output of your uniformity score is not a probability, I think it has the disadvantage of being difficult to interpret.
  • Since your student's uniformity score is a probability, I think it has the advantage of being easy to interpret.
  • While not harmful to the correctness of the test, your student's uniformity test has one aspect that is useless (but also harmless; just redundant), namely: there is not point to add $t$ on right side of the inequality as the output is a probability.
  • One small error in your student's uniformity test is using $<$ instead of $\le$ for the upper bound. I think he/she should have used $\le$.

On the other hand, I think my proposed test is:

  • Unlike your tests, mine works with continuous numbers.
  • Similar to your student's test, mine is also easy to interpret cause it's a probability as well.

The disadvantage is that my test is possibly harder to compute. But this is not a big deal as I think the correctness of the test is worth the slight increase in difficulty.

Analysing your uniformity tests

Your notation confused me a bit. I assume that you mean that $h$ is a histogram, and $h_i$ is the frequency that is associated with input $i$ where $i$ is some number between $1$ and $k$ (which your RNG produces).

So to rewrite your test, I think you wanted to say this: \begin{equation} \text{teacher uniformity score} = \sum_{i \in \{1,2,\ldots,k\}} (h_i - \frac{n}{k})^2 \end{equation}

Then the uniformity test: the RNG is uniformly distributed if the uniformity score is less than some threshold $t_{teacher}$ that we agree upon. I.e. output from RNG is uniformly distributed if the following statement is true: \begin{equation} \text{teacher uniformity score} < t_{teacher} \end{equation}

I see why your method makes sense. Basically, your score is essentially the sum of squared errors of observed frequencies against expected frequencies, where expected is what should happen if the RNG is perfectly uniform. So if the sum of squared errors is minimal, the more uniform it is.

So my opinion about your test is this:

  • Pros:
    • it does have merit, and it does reflect the degree of uniformity of the RNG.
  • Cons:
    • it is not easy to interpret because the output is not a probability. If you augment it such that the output is a probability, then it would be easier to interpret. At least I can't interpret it easily. This interpret-ability is a subjective thing.
    • it is not friendly with continuous RNGs. You can't use histograms with continuous numbers without approximating numbers into bins, which essentially deletes information and opens the vulnerability that might potentially lead to masking non-uniformity of some poor RNGs. We need to look at the probability density functions (PDFs) instead.

What you wrote about what your student claimed is also confusing, but I rewrite that to what I think is the closest thing that makes best sense in my view (kindly correct me if you think the student didn't say this): \begin{equation} \text{student uniformity score} = \sum_{i \in \{1,2,\ldots,k\}} \frac{h_i}{n/k} \end{equation}

This is essentially the ratio between observed histogram and expected histogram (expected is the perfect uniform one). Clearly, if the ratio is 1 then it's perfectly uniform. But if it isn't (i.e. greater or less than 1) then it is not perfectly uniform.

So your student is suggesting that the closer the ratio is to 1, the more random it is. So $t_{student}$ is essentially some kind of error term that accounts for deviations from 1: \begin{equation} 1-t_{student} \le \text{student uniformity score} \le 1+t_{student} \end{equation}

My opinion about your student's uniformity test is as follows:

  • Pros:
    • It does have merit for the same reason yours does.
    • It is easy to interpret (thanks to it being a probability). This is subjective though. I think most humans agree with me (correct me if you disagree).
  • Cons:
    • It's not easy to interpret just like yours.
    • It doesn't work well with continuous RNGs for the same reason yours doesn't.

My test which I think is superior to both of your tests

Based on the question, you only seem to be interested in finding whether a sequence is uniformly distributed. I.e. it doesn't need to satisfy any other property.

If your PRNG is outputting some float, e.g. something in $[0,1]$, I would personally suggest doing this (which seems somehow similar to your student's suggest? I don't know..):

  • Repeat PRNG $n$ times, where $n$ is large enough.
  • Estimate the true probability density function $f_X$. Let $\hat f_X$ be your best estimation.

Then your sequence is perfectly uniformly distributed if $\hat f_X(x)$ forms a line with a slope of 0. This is usually unachievable in reality.

So you need to define a degree of uniformity that, if satisfied, you subjectively declare that the PRNG is uniformly distributed.

To do this, I would personally suggest to define the null hypothesis:

  • A sequence is uniform if the probability of each number to appear is $1/n$.

You then compute the probability based on your empirical trials.

Finally, you do some variant of Fisher's exact test to find the probability that your sequence can exist if the null hypothesis is true.

Once you get that probability, it is there where you plug your threshold. You and your student need to agree on this threshold. It's subjective.

In this specific approach, I'd expect you to agree on a very large probability that is very close to $1$. Maybe $0.999$, or whatever you and your student seem to be happy with.

Or maybe do the opposite: define the null hypothesis as:

  • Your sequence is not uniform, and probability of each number to appear is not $1/n$.

Then, you need to choose a a very small threshold in order to reject it.

Related Question