Is there a simple test for uniform distributions?

You could use a chi-square test. Basically, this tests whether the number of draws falling into various intervals is consistent with a uniform distribution. See Knuth's TAOCP Volume 2, Seminumerical Algorithms, for an explanation of the chi-square test.
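
As a minimal sketch of that idea, assuming draws in [0, 1) and an arbitrary choice of 10 bins (`scipy.stats.chisquare` does the interval-count comparison described above):

```python
# A minimal sketch, assuming draws in [0, 1); the choice of 10 bins
# is arbitrary and affects the test (see the last answer below).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
draws = rng.random(1000)                  # hypothetical sample to test

observed, _ = np.histogram(draws, bins=10, range=(0.0, 1.0))
expected = np.full(10, len(draws) / 10)   # uniform => equal expected counts

chi2, p = stats.chisquare(observed, expected)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
# A small p (say < 0.05) is evidence the draws are not uniform.
```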

Update: If the chi-square test is too complicated, you could plot a histogram of your output and eyeball whether it looks uniform. But if you'd like to quantify how even it is, that's exactly what the chi-square test does. The test will also tell you whether the data are too evenly spread out, i.e. not random enough.
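
A quick sketch of the eyeball check, assuming matplotlib is available and hypothetical draws in [0, 1):

```python
# Plot a histogram and look for roughly level bars.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
draws = rng.random(1000)          # hypothetical sample to test

plt.hist(draws, bins=20, range=(0.0, 1.0), edgecolor="black")
plt.xlabel("value")
plt.ylabel("count")
plt.title("Bars should be roughly level if the draws are uniform")
plt.show()
```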


You can also use the Kolmogorov-Smirnov test. It is a nonparametric test and works for many distributions, including the uniform. Its advantage over other tests is that it looks at the whole distribution, and it is fairly easy to use.

The idea is to compare the CDF of the $H_0$ distribution with the empirical CDF of your observations and find the maximum difference. The distribution of this difference under $H_0$ does not depend on the hypothesized distribution (hence the test is nonparametric); it depends only on $n$, the number of observations you draw. You then look the value up in a table to see whether it falls outside the expected range of the difference.

The empirical CDF of your observations $\{x_1, x_2, \dots, x_n\}$ is $F_n(x)=\frac{1}{n}\sum_{i=1}^n I_{x_i \le x}$. Your test statistic is then $D_n=\sup_x|F(x)-F_n(x)|$. Assuming you sort the $x_i$'s in ascending order, and assuming your numbers come from Uniform[0,1] (without loss of generality, since you can rescale them), this becomes $D_n=\max_i\left(\max\left(\left|x_i-\frac{i}{n}\right|,\left|x_i-\frac{i-1}{n}\right|\right)\right)$.
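
As a sketch, the closed form above can be computed by hand and checked against `scipy.stats.kstest`; the sample here is hypothetical:

```python
# Compute D_n directly from the sorted sample, then verify with scipy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = np.sort(rng.random(200))      # sorted ascending, as assumed above
n = len(x)
i = np.arange(1, n + 1)

# D_n = max_i max(|x_i - i/n|, |x_i - (i-1)/n|)
d_n = np.max(np.maximum(np.abs(x - i / n), np.abs(x - (i - 1) / n)))

result = stats.kstest(x, "uniform")   # one-sample K-S against Uniform[0, 1]
print(f"by hand: D_n = {d_n:.4f}, scipy: D_n = {result.statistic:.4f}")
print(f"p-value = {result.pvalue:.3f}")
```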

Once you compute that, it is only a matter of looking the value up in a table or using the right software. I recommend this site, which has an excellent tool for computing K-S along with many other tests. For K-S, select "Dn", enter the value you got, press the button, and treat the sample as inconsistent with uniformity if the computed P value is less than $\alpha$.


One issue with the chi-square goodness-of-fit test is that, because it operates on the empirical PDF rather than the CDF, its outcome depends on the bin size (class-interval width) you choose: if your data actually come from a non-uniform distribution, a wider bin gives a bigger chance of reaching the wrong conclusion (imagine the extreme of a single bin for all your data: a perfectly rectangular histogram!). Also, if any bin count falls below 5 (as a rule of thumb), you need to merge bins, which again affects the outcome of the test. CDF-based tests do not use binning and so do not suffer from this problem.

The Kolmogorov-Smirnov test on the CDF is rather crude, though, because it considers only one data point (the point of maximum deviation). A better one is the Cramer-von Mises test, which is similar to K-S but instead integrates the squared difference between the theoretical and empirical CDFs across the entire data range. There are other tests as well (Anderson-Darling, Shapiro-Wilk, etc.); a sketch comparing K-S and Cramer-von Mises follows below.
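
As a sketch of that comparison, assuming SciPy 1.6 or later (which provides `stats.cramervonmises`), the two CDF-based tests can be run side by side on a hypothetical sample:

```python
# Compare K-S and Cramer-von Mises on the same sample.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.random(500)               # hypothetical sample to test

ks = stats.kstest(x, "uniform")
cvm = stats.cramervonmises(x, "uniform")
print(f"K-S : D = {ks.statistic:.4f}, p = {ks.pvalue:.3f}")
print(f"CvM : W = {cvm.statistic:.4f}, p = {cvm.pvalue:.3f}")
# K-S uses only the maximum deviation between the CDFs; CvM integrates
# the squared deviation over the whole range.
```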

Tags:

Statistics