Ruby: Using rand() in code but writing tests to verify probabilities

I think you should separate your goals. The first is to stub Kernel.rand, as you mention. With RSpec, for example, you can do something like this:

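test_values = [1, 2, 3]
# Successive calls to Kernel.rand return 1, 2, 3; after that the last value repeats.
# This is the older RSpec syntax. RSpec 3 spells it: allow(Kernel).to receive(:rand).and_return(*test_values)
Kernel.stub!(:rand).and_return(*test_values)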

Note that this stub only takes effect when you call rand with Kernel as the explicit receiver, i.e. Kernel.rand. If you call a bare "rand", the current "self" receives the message, and you get a real random number instead of one of the test_values.
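To make the receiver issue concrete, here is a minimal sketch in plain Ruby without RSpec. It swaps out Kernel.rand by hand as a stand-in for the stub. The helper names explicit_pick and implicit_pick are made up for the illustration.

```ruby
# Hypothetical helpers, named only for this illustration.
def explicit_pick
  Kernel.rand(10) # sent to Kernel itself, so a replacement on Kernel intercepts it
end

def implicit_pick
  rand(10) # sent to self, which resolves to the private Kernel#rand instance method
end

# Hand-rolled stand-in for the stub: temporarily replace Kernel.rand.
class << Kernel
  alias_method :original_rand, :rand

  def rand(*_args)
    42 # never a valid result of rand(10), so it is easy to recognise
  end
end

explicit_result = explicit_pick
implicit_result = implicit_pick

# Put the real Kernel.rand back.
class << Kernel
  alias_method :rand, :original_rand
  remove_method :original_rand
end

puts "explicit call returned #{explicit_result}"
puts "implicit call returned #{implicit_result}"
```

The explicit call sees the replacement value. The implicit call still returns a genuine random number between 0 and 9.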

The second goal is something like a field test, where you actually generate random numbers and use a tolerance to check that the observed frequencies land close to the desired percentages. This will never be perfect, and it will probably need a human to evaluate the results. It is still worth doing: you might discover that a different source of randomness, such as reading from /dev/random, serves you better. It also acts as a warning sign if you migrate to a platform whose system libraries are worse at generating randomness, or if a particular version has a bug.
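A field test of this kind might look roughly like the sketch below. The method lucky?, the trial count, and the tolerance are all assumptions picked for illustration. A real test would choose the tolerance based on how many trials it runs and how rare a false alarm needs to be.

```ruby
# Hypothetical code under test: succeeds with the given probability.
def lucky?(probability, rng)
  rng.rand < probability
end

expected  = 0.3      # desired success rate
trials    = 100_000  # more trials allow a tighter tolerance
tolerance = 0.01     # accepted absolute deviation from the expected rate
rng       = Random.new

# Count how often the event actually happens with real random numbers.
hits = trials.times.count { lucky?(expected, rng) }
observed = hits.to_f / trials
within_tolerance = (observed - expected).abs <= tolerance

puts format("observed %.4f, expected %.2f, within tolerance: %s",
            observed, expected, within_tolerance)
```

Because the numbers are genuinely random, a test like this can fail by chance. The wide tolerance relative to the trial count keeps that rare, but it cannot rule it out.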

It really depends on your goals. Do you only want to test your weighting algorithm, or also the randomness?


It's best to stub Kernel.rand to return fixed values.

Kernel.rand is not your code. You should assume it works and write tests that exercise your code, not rand itself. A fixed set of values that you have chosen and written into the test is also better than depending on what rand happens to produce for a specific seed.
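As a rough sketch of testing only the weighting logic with hand-picked values: the example below passes a fake random source into the method instead of stubbing Kernel.rand. That is a swapped-in technique, not something the answer prescribes. The pick method and the weights are hypothetical.

```ruby
# Hypothetical weighted picker; the weights are assumed to sum to 1.0.
def pick(weights, rng)
  roll = rng.rand
  weights.each do |item, weight|
    return item if roll < weight
    roll -= weight
  end
  weights.keys.last # fallback for floating-point leftovers
end

# Fake random source that replays values chosen to hit each weight band.
fixed_values = [0.0, 0.69, 0.7, 0.89, 0.95]
fixed_rng = Object.new
fixed_rng.define_singleton_method(:rand) { |*_args| fixed_values.shift }

weights = { a: 0.7, b: 0.2, c: 0.1 }
results = Array.new(5) { pick(weights, fixed_rng) }

puts results.inspect
```

The values sit just inside and just past each boundary (0.7 and 0.9), so the test pins down where one item stops and the next begins.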


If you want to go down the consistent seed route, look at Kernel#srand:

http://www.ruby-doc.org/core/classes/Kernel.html#M001387

To quote the docs (emphasis added):

Seeds the pseudorandom number generator to the value of number. If number is omitted or zero, seeds the generator using a combination of the time, the process id, and a sequence number. (This is also the behavior if Kernel::rand is called without previously calling srand, but without the sequence.) By setting the seed to a known value, scripts can be made deterministic during testing. The previous seed value is returned. Also see Kernel::rand.
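A small sketch of what that means in practice. The seed 1234 is arbitrary. The actual numbers produced for a given seed can differ between Ruby versions and implementations, so the example only checks that the two runs match each other.

```ruby
# Seed the global generator with a known value and draw a few numbers.
Kernel.srand(1234)
first_run = Array.new(5) { Kernel.rand(100) }

# Re-seeding with the same value replays the same sequence.
# srand returns the previous seed, which here is the 1234 set above.
previous_seed = Kernel.srand(1234)
second_run = Array.new(5) { Kernel.rand(100) }

puts "sequences match: #{first_run == second_run}"
puts "previous seed: #{previous_seed}"

# Re-seed without an argument so later code is unpredictable again.
Kernel.srand
```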