What's the best way to test this?

IMHO most answers so far have missed the point of the Koan question, with the exception of @Super_Dummy. Let me elaborate on my thinking...

Say that instead of dice, we were flipping coins. Add on another constraint of only using one coin in our set, and we have a minimum non-trivial set that can generate "random" results.

If we wanted to check that flipping the "coin set" [in this case a single coin] generated a different result each time, we would expect the values of each separate result to be the same 50% of the time, on a statistical basis. Running that unit test through n iterations for some large n will simply exercise the PRNG. It tells you nothing of substance about the actual equality or difference between the two results.

To put it another way, in this Koan we're not actually concerned with the values of each roll of the dice. We're really more concerned that the returned rolls are actually representations of different rolls. Checking that the returned values are different is only a first-order check.

Most of the time that will be sufficient - but very occasionally, randomness could cause your unit test to fail. That's not a Good Thing™.

If, in the case that two consecutive rolls return identical results, we should then check that the two results are actually represented by different objects. This would allow us to refactor the code in future [if that was needed], while being confident that the tests would still always catch any code that didn't behave correctly.

TL;DR?

def test_dice_values_should_change_between_rolls
  dice = DiceSet.new

  dice.roll(5)
  first_time = dice.values

  dice.roll(5)
  second_time = dice.values

  assert_not_equal [first_time, first_time.object_id],
    [second_time, second_time.object_id], "Two rolls should not be equal"

  # THINK ABOUT IT:
  #
  # If the rolls are random, then it is possible (although not
  # likely) that two consecutive rolls are equal.  What would be a
  # better way to test this.
end

I'd say the best way to test anything that involves randomness is statistically. Run your dice function in a loop a million times, tabulate the results, and then run some hypothesis tests on the results. A million samples should give you enough statistical power that almost any deviations from correct code will be noticed. You are looking to demonstrate two statistical properties:

  1. The probability of each value is what you intended it to be.
  2. All rolls are mutually independent events.

You can test whether the frequencies of the dice rolls are approximately correct using Pearson's Chi-square test. If you're using a good random nunber generator, such as the Mersenne Twister (which is the default in the standard lib for most modern languages, though not for C and C++), and you're not using any saved state from previous rolls other than the Mersenne Twister generator itself, then your rolls are for all practical purposes independent of one another.

As another example of statistical testing of random functions, when I ported the NumPy random number generators to the D programming language, my test for whether the port was correct was to use the Kolmogorov-Smirnov test to see whether the numbers generated matched the probability distributions they were supposed to match.


There is no way to write a state-based test for randomness. They are contradictory, since state-based tests proceed by giving known inputs and checking output. If your input (random seed) is unknown, there is no way to test.

Luckily, you don't really want to test the implementation of rand for Ruby, so you can just stub it out with an expectation using mocha.

def test_roll
  Kernel.expects(:rand).with(5).returns(1)
  Diceset.new.roll(5)
end