Compute rand7() using rand5()
3 * rand5() + rand5()
is not uniformly distributed. For example, it generates 0 in only one way, but 3 two ways, so 3 is more likely to occur than 0.
It's just like 2 * rand5() * rand5()
, 4 * rand5() + rand5()
, etc.
But 5 * rand5() + rand5()
is uniformly distributed.
It is like generating two random digits of a base-5 number.
00 => 0
01 => 1
02 => 2
03 => 3
04 => 4
10 => 5
11 => 6
12 => 7
...
There is only and only one way to generate each number from 0 to 24.
If X
and Y
are independent and uniformly distributed on the set S_5 = {0,1,2,3,4}
, then
5*X + Y
is uniformly distributed on the set{0,...,24}
, but3*X + Y
is not uniformly distributed on{0,...,16}
and neither is its restriction on{0,...,13}
It's easy to see that (1) is indeed the case, because f(x,y) = 5*x + y
is a bijection between S_5 x S_5
and S_25
.
If we look at the distribution of 3*X + Y
we get:
>>> Counter(3*x + y for x in range(5) for y in range(5))
Counter({3: 2, 4: 2, 6: 2, 7: 2, 9: 2, 10: 2, 12: 2, 13: 2, 0: 1, 1: 1, 2: 1, 5: 1, 8: 1, 11: 1, 14: 1, 15: 1, 16: 1}
The results 3, 4, 6, 7, 9, 10, 12, 13 are twice as likely as 1, 2, 5, 8 or 11. More proof:
>>> def rand7():
... x = 3*rand5() + rand5()
... if x < 14: return x % 7
... return rand7()
...
>>> Counter(rand7() for _ in xrange(100000))
Counter({6: 18219, 3: 18105, 4: 13734, 5: 13715, 2: 13634, 0: 13560, 1: 9033}
6 and 3 have a 4/22 ~ 18.2% chance of occuring, 4, 5, 2 and 0 have an 3/22 ~ 13.6% chance and 1 only has a 2/22 ~ 9.1% chance. That's one rigged dice.