Gaussian random distribution in Postgresql
I was searching for a way to generate numbers according to a gaussian distribution and first found this post. This is why I share what I've found just after:
There is, since at least PostgreSQL 8.4, an additional module called tablefunc (http://www.postgresql.org/docs/current/static/tablefunc.html).
It proposes a function normal_rand(n, mean, stddev)
generating n pseudo-random numbers using a gaussian distribution (so this function returns a set of values, typically used in the FROM clause). However, if you set n to be 1, it can be used as a function returning a value and not a set of values.
Considering a table nb10 containing 10 records, the two following queries return a set of 10 pseudo-random numbers following a standard gaussian distribution (mean = 0, stddev = 1)
SELECT normal_rand(1, 0, 1) FROM nb10;
and
SELECT * from normal_rand(10, 0, 1);
I hope this could help anyone in the future ... :-)
To answer your question specifically, you could use something like:
SELECT floor(random_rand(1, 0, 1) * 250 + 125);
Unfortunately, it is possible to obtain an answer not in the range [0, 249] using this query. You could for example:
- use a recursive query, which I find a bit overkill, for discarding values not in the range
[0, 249]
, or - do your select into a loop into your host language, accepting the value only if its in the range
[0, 249]
, or use the modulo operator to remain in the
[0, 250[
range, I think this is the best solution, although it alternates slightly the gaussian curve. Here is the final query I suggest you use (the modulo/+/modulo tricks is because -x modulo y with x a positive number gives a negative number in PostgreSQL, which is not a bad thing :p):SELECT ((floor(normal_rand(1,0,1)*250 + 125)::int % 250) + 250) % 250 as v;
The sum of 12 uniform distributions on the range [0, 1) is a good approximation to a Gaussian distribution bounded in the range [0, 12). This can then easily be re-scaled by multiplying by a constant and then adding/subtracting a constant.
select
random() +
random() +
random() +
random() +
random() +
random() +
random() +
random() +
random() +
random() +
random() +
random();
http://books.google.com/books?id=EKA-yeX2GVgC&pg=PA185&lpg=PA185&dq=%22sum+of+12+uniform+random+variables%22&source=bl&ots=YfwwE0fBB3&sig=HX9J9Oe6x316kVL8uamDU_GOsn4&hl=en&sa=X&ei=bJLZUur1GozaqwGHm4DQDQ&ved=0CEUQ6AEwAw#v=onepage&q=%22sum%20of%2012%20uniform%20random%20variables%22&f=false