Prove a randomly generated number is uniformly distributed

There is no way to prove it, because the generator might first generate a uniform distribution and later deviate into a non-uniform one.


To prove it, you need to know the algorithm being used and show in graph terms that the set of all states constitutes a cycle, that there are no subcycles, and that the cardinality of the state space modulo N is zero so that there is no set of states that occur more/less frequently than others. This is how we know that Mersenne Twister, for instance, is uniformly distributed even though the 64-bit version has a cycle length of 2^19937 - 1 and could never be enumerated within the lifetime of the universe.
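For a generator small enough to enumerate, the graph argument can be checked by brute force. The sketch below assumes a toy linear congruential generator with made-up parameters (a=5, c=3, m=16) and takes the output to be the state modulo N. The helper names are hypothetical, and none of this reflects how Mersenne Twister itself is analysed.

```python
from collections import Counter

def lcg_step(state, a=5, c=3, m=16):
    # One transition of a toy LCG; a, c and m are illustrative values only.
    return (a * state + c) % m

def orbit(step, start):
    # Follow the state graph from `start` until some state repeats.
    seen, states, s = set(), [], start
    while s not in seen:
        seen.add(s)
        states.append(s)
        s = step(s)
    return states, s  # s is the first state that was revisited

def is_full_cycle(step, start, n_states):
    # A single cycle through every state: all states visited, then back to start.
    states, revisited = orbit(step, start)
    return len(states) == n_states and revisited == start

def output_counts(step, start, n_outputs):
    # How often each output value (state mod n_outputs) occurs along the orbit.
    states, _ = orbit(step, start)
    return Counter(s % n_outputs for s in states)

print(is_full_cycle(lcg_step, 0, 16))
print(sorted(output_counts(lcg_step, 0, 4).items()))
print(sorted(output_counts(lcg_step, 0, 3).items()))
```

With 16 states and N=4 every output occurs exactly 4 times per cycle. With N=3 the state count is not a multiple of N, so output 0 occurs 6 times while 1 and 2 occur 5 times each.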

Otherwise you use statistical tests to test the hypothesis of uniformity. Statistics can't prove a result; it can only fail to disprove the hypothesis. The larger your sample size is, the more compelling the failure to disprove a hypothesis is, but it is never proof. (This perspective causes more communication problems with non-statisticians/non-scientists than anything else I know.) There are many tests for uniformity, including chi-square tests, Anderson-Darling, and Kolmogorov-Smirnov, to name just a few.
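As a rough illustration of the chi-square approach, here is a minimal sketch using only the standard library. The bin count (10), the sample size, the seed and the 5% significance level are all arbitrary choices made for the example; 16.919 is the tabulated chi-square critical value for 9 degrees of freedom at that level. Python's own `random` module stands in for the generator under test.

```python
import random
from collections import Counter

def chi_square_statistic(samples, n_bins):
    # Sum over bins of (observed - expected)^2 / expected,
    # where every bin is expected to hold len(samples) / n_bins values.
    counts = Counter(samples)
    expected = len(samples) / n_bins
    return sum((counts.get(k, 0) - expected) ** 2 / expected
               for k in range(n_bins))

# Tabulated critical value: chi-square, 9 degrees of freedom, alpha = 0.05.
CRITICAL_DF9_ALPHA_05 = 16.919

rng = random.Random(12345)  # arbitrary seed, only to make the run repeatable
samples = [rng.randrange(10) for _ in range(100_000)]
stat = chi_square_statistic(samples, 10)

if stat > CRITICAL_DF9_ALPHA_05:
    print(f"chi2 = {stat:.2f}: uniformity rejected at the 5% level")
else:
    print(f"chi2 = {stat:.2f}: failed to reject uniformity (not a proof)")
```

Note the wording of the second branch: a small statistic means only that this test, on this sample, found no evidence against uniformity.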

All of the uniformity tests will pass sequences of values such as 0,1,2,...,N-1,0,1,... so uniformity is not sufficient to say you have a good generator. You should also be testing for serial correlation with tests such as spacings tests, runs-up/runs-down, runs above/below the mean, "birthday" tests, and so on.
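To see why, the sketch below feeds the counting sequence 0,1,...,9,0,1,... to a frequency check and to a runs-up/runs-down test. The runs test compares the number of ascending and descending runs with its expectation under independence, which has mean (2n-1)/3 and variance (16n-29)/90. The function name is hypothetical, and dropping tied neighbours is a simplification chosen for the example.

```python
import math
from collections import Counter

def runs_up_down_z(samples):
    # Direction of each step (+1 up, -1 down); tied neighbours are dropped,
    # which is a simplification for this sketch.
    signs = [1 if b > a else -1
             for a, b in zip(samples, samples[1:]) if a != b]
    # A new run starts every time the direction changes.
    runs = 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    n = len(samples)
    mean = (2 * n - 1) / 3
    variance = (16 * n - 29) / 90
    return (runs - mean) / math.sqrt(variance)

counter_sequence = [i % 10 for i in range(1000)]

# Perfectly flat histogram: any pure frequency test is satisfied.
print(Counter(counter_sequence))

# Far too few runs for an independent sequence: a hugely negative z-score.
print(runs_up_down_z(counter_sequence))
```

Every value appears exactly 100 times, yet the sequence has only 199 runs where about 666 would be expected, so the z-score is dozens of standard deviations below zero.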

A pretty comprehensive suite of tests for uniformity and serial correlation was created by George Marsaglia over the course of his career, and published in 1995 as what he jokingly called the "Diehard tests" (because it's a heavy duty battery of tests).


For black-box testing (you don't have access to the source code), you can't prove it is uniformly distributed (UD). You can, however, perform statistical tests to estimate how likely it is to be UD. Run the generator many times (say, N*X times) and each of the N possible values (0 through N-1) should have appeared around X times.

This completely ignores whether the numbers are random or not; it just focuses on uniformity. However, it would only prove that the generator was uniformly distributed if you were to run infinitely many tests. At best, you get a probability that the generator was uniform during the first N*X iterations, but it is simple and easy to implement.
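A minimal sketch of that counting check, treating the generator as an opaque callable. The function name `frequency_report` and the choice of N=6, X=10000 are assumptions made for the example, and Python's `random` module stands in for the black box.

```python
import random
from collections import Counter

def frequency_report(generate, n, x):
    # Call the black-box `generate()` n*x times and tally every value seen.
    counts = Counter(generate() for _ in range(n * x))
    # Largest gap between an observed count and the expected count x.
    worst_gap = max(abs(counts.get(k, 0) - x) for k in range(n))
    # Values outside 0..n-1 mean the generator is broken, not just biased.
    out_of_range = sum(c for k, c in counts.items() if not 0 <= k < n)
    return counts, worst_gap, out_of_range

rng = random.Random(2024)  # arbitrary seed for a repeatable run
counts, worst_gap, out_of_range = frequency_report(
    lambda: rng.randrange(6), n=6, x=10_000)

for value in range(6):
    print(value, counts[value])
print("largest deviation from X:", worst_gap)
print("values out of range:", out_of_range)
```

How large a deviation is acceptable still needs a statistical criterion such as a chi-square test; the raw counts only show where to look.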