Random number generation in C++11: how to generate, how does it work?
The question is way too broad for a complete answer, but let me cherry-pick a couple of interesting points:
Why "equally likely"
Suppose you have a simple random number generator that generate the numbers 0, 1, ..., 10 each with equal probability (think of this as the classic rand()
). Now you want a random number in the range 0, 1, 2, each with equal probability. Your knee-jerk reaction would be to take rand() % 3
. But wait, the remainders 0 and 1 occur more often than the remainder 2, so this isn't correct!
This is why we need proper distributions, which take a source of uniform random integers and turn them into our desired distribution, like Uniform[0,2]
in the example. Best to leave this to a good library!
Engines
Thus at the heart of all randomness is a good pseudo-random number generator that generates a sequence of numbers that uniformly distributed over a certain interval, and which ideally have a very long period. The standard implementation of rand()
isn't often the best, and thus it's good to have a choice. Linear-congruential and the Mersenne twister are two good choices (LG is actually often used by rand()
, too); again, it's good to let the library handle that.
How it works
Easy: first, set up an engine and seed it. The seed fully determines the entire sequence of "random" numbers, so a) use a different one (e.g. taken from /dev/urandom
) each time, and b) store the seed if you wish to recreate a sequence of random choices.
#include <random>
typedef std::mt19937 MyRNG; // the Mersenne Twister with a popular choice of parameters
uint32_t seed_val; // populate somehow
MyRNG rng; // e.g. keep one global instance (per thread)
void initialize()
{
rng.seed(seed_val);
}
Now we can create distributions:
std::uniform_int_distribution<uint32_t> uint_dist; // by default range [0, MAX]
std::uniform_int_distribution<uint32_t> uint_dist10(0,10); // range [0,10]
std::normal_distribution<double> normal_dist(mean, stddeviation); // N(mean, stddeviation)
...And use the engine to create random numbers!
while (true)
{
std::cout << uint_dist(rng) << " "
<< uint_dist10(rng) << " "
<< normal_dist(rng) << std::endl;
}
Concurrency
One more important reason to prefer <random>
over the traditional rand()
is that it is now very clear and obvious how to make random number generation threadsafe: Either provide each thread with its own, thread-local engine, seeded on a thread-local seed, or synchronize access to the engine object.
Misc
- An interesting article on TR1 random on codeguru.
- Wikipedia has a good summary (thanks, @Justin).
- In principle, each engine should typedef a
result_type
, which is the correct integral type to use for the seed. I think I had a buggy implementation once which forced me to force the seed forstd::mt19937
touint32_t
on x64, eventually this should be fixed and you can sayMyRNG::result_type seed_val
and thus make the engine very easily replaceable.
A random number generator is a equation that, given a number, will give you a new number. Typically you either provide the first number or its pulled from something like the system time.
Each time you ask for a new number it uses the previous number to execute the equation.
A random number generator is not considered very good if it has a tendency to produce the same number more often than other numbers. i.e. if you wanted a random number between one and 5 and you had this distribution of numbers:
- 1: 1%
- 2: 80%
- 3: 5%
- 4: 5%
- 5: 9%
2 is generated FAR more often than any other number, so it is more likely to be produced than other numbers. If all numbers were equally like you would have a 20% chance of getting each number every time. To say it another way, the above distribution is very uneven because 2 is favored. A distribution with all 20%'s would be even.
Typically, if you want a true random number you would pull data from something like weather or some other natural source rather than a random number generator.