How should I test randomness?

Question

Consider a method to randomly shuffle elements in an array. How would you write a simple yet robust unit test to make sure that this is working?

I've come up with two ideas, both of which have noticeable flaws:

Shuffle the array, then make sure its order differs from before. This sounds good, but fails if the shuffle happens to shuffle in the same order. (Improbable, but possible.)
Shuffle the array with a constant seed, and check it against the predetermined output. This relies on the random function always returning the same values given the same seed. However, this is sometimes an invalid assumption.

Consider a second function which simulates dice rolls and returns a random number. How would you test this function? How would you test that the function...

never returns a number outside the given bounds?
returns numbers in a valid distribution? (Uniform for one die, normal for large numbers of dice.)

I'm looking for answers offering insight into testing not only these examples but random elements of code in general. Are unit tests even the right solution here? If not, what sort of tests are?

Just to ease everyone's mind I'm not writing my own random number generator.

Solomon Ucko · Answer

The other answers explain how to make sure the function is random, but don't talk about testing for correctness.
For example, if the function is supposed to generate a random number between 0 and 1, make sure the result is between 0 and 1. If it's supposed to shuffle a list, make sure the input and the output have the same elements. Etc.

Carra · Answer

Let it run a bunch of times and visualize your data.

Here's an example of a shuffle from Coding Horror, you can see that the algorithm is OK or not:

It's easy to see that every possible item is returned at least once (the boundaries are OK) and that the distribution is OK.

user unknown · Answer

Case 1: Testing a shuffle:

Consider an Array [0, 1, 2, 3, 4, 5], shuffle it, what can go wrong? The usual stuff: a) no shuffle at all, b) shuffling 1-5 but not 0, shuffling 0-4 but not 5, shuffling, and always generating the same pattern, ...

One test to catch them all:

Shuffle 100 times, add the values in each slot. The sum of each slot should be similar to each other slot. Avg/Stddev can be calculated. (5+0)/2=2.5, 100*2.5 = 25. Expected value is around 25, for example.

If the values are out of range, there is a small chance, that you got a false negative. You can calculate, how big that chance is. Repeat the test. Well - of course there is a small chance, that the test fails 2 times in a row. But you don't have a routine which automatically deletes your source, if the unit-test fails, do you? Run it again!

It can fail 3 times in a row? Maybe you should try your luck at the lottery.

Case 2: Roll a dice

The dice-roll question is the same question. Throw the dice 6000 times.

for (i in 0 to 6000) 
    ++slot [Random.nextInt (6)];
return (slot.max - slot.min) < threshold;

Wayne Conrad · Answer

To test that a source of random numbers is generating something that at least has the appearance of randomness, I would have the test generate a fairly large sequence of bytes, write them to a temporary file, and then shell out to Fourmilab's ent tool.  Give ent the -t (terse) switch so it will generate easy-to-parse CSV.  Then check the various numbers to see that they are "good."

To decide what numbers are good, use a known source of randomness to calibrate your test.  The test should almost always pass when given a good set of random numbers.  Because even a truly random sequence has a probability of generating a sequence that appears to be non-random, you can't get a test that is certain to pass.  You just pick thresholds that make it unlikely that a random sequence will cause a test failure.  Isn't randomness fun?

Note: You cannot write a test that shows that a PRNG generates a "random" sequence.  You can only write a test that, if it passes, indicates some probability that the sequence generated by the PRNG is "random."  Welcome to the joy of randomness!

Bill the Lizard · Answer

I don't think unit tests are the right tool for testing randomness.  A unit test should call a method and test the returned value (or object state) against an expected value.  The problem with testing randomness is that there isn't an expected value for most of the things you'd like to test.  You can test with a given seed, but that only tests repeatability.  It doesn't give you any way to measure how random the distribution is, or if it's even random at all.

Fortunately, there are a lot of statistical tests you can run, such as the Diehard Battery of Tests of Randomness.  See also:

How to unit test a pseudo random number generator?

Steve Jessop recommends that you find a tested implementation of the same RNG algorithm that you're using and compare its output with selected seeds against your own implementation.
Greg Hewgill recommends the ENT suite of statistical tests.
John D. Cook refers readers to his CodeProject article Simple Random Number Generation, which includes an implementation of the Kolmogorov-Smirnov test mentioned in Donald Knuth's volume 2, Seminumerical Algorithms.
Several people recommend testing that the distribution of the numbers generated is uniform, the Chi-squared test, and testing that the mean and standard deviation are within the expected range.  (Note that testing the distribution alone is not enough.  [1,2,3,4,5,6,7,8] is a uniform distribution, but it's certainly not random.)

Unit Testing with functions that return random results

Brian Genisio points out that mocking your RNG is one option for making your tests repeatable, and provides C# sample code.
Again, several more people point to using fixed seed values for repeatability and simple tests for uniform distribution, Chi-squared, etc.

Unit Testing Randomness is a wiki article that talks about many of the challenges already touched on when trying to test that which is, by its nature, not repeatable.  One interesting bit that I gleaned from it was the following:

I've seen winzip used as a tool to measure the randomness of a file of values before (obviously, the smaller it can compress the file the less random it is).

Gary Rowe · Answer

You can rely on secure random number generators

I just had a horrible thought: you're not writing your own random number generator are you?

Assuming you're not, then you should test the code that you are responsible for, not other people's code (such as the SecureRandom implementation for your framework).

Testing your code

To test that your code responds correctly, it is normal to use a low visibility method to produce the random numbers so that it can be easily overridden by a unit test class. This overridden method effectively mocks out the random number generator and gives you complete control over what is produced and when. Consequently you can fully exercise your code which is the goal of unit testing.

Obviously you will check edge conditions and ensure that the shuffling takes place exactly as your algorithm dictates given the appropriate inputs.

Testing the secure random number generator

If you are uncertain that the secure random number generator for your language is not truly random or is buggy (provides out of range values etc), then you need to perform a detailed statistical analysis of the output over several hundred million iterations. Plot the frequency of occurrence of each number and it should show up with equal probability. If the results skew one way or another you should report your findings to the framework designers. They will definitely be interested in fixing the problem since secure random number generators are fundamental to many encryption algorithms.

anon · Answer

General pointers I've found useful when dealing with code that takes randomized input:
Check the edge cases of expected randomness (max and min values, and the max+1 and min-1 values if applicable).
Check places (on, above, and below) where numbers have inflection points (ie -1, 0, 1, or greater than 1, less than 1 and non-negative for cases where a fractional value might mess up the function).
Check a few places completely outside the allowed input.
Check a few typical cases.
You can also add a random input, but for a unit test that has the undesirable side effect that the same value isn't under test each time the test is run (a seed approach can work though, test the first 1,000 random numbers from seed S or somesuch).

For testing the output of a random function, it is important to identify the goal.  In the case of cards, is the goal to test the uniformity of the 0-1 random generator, to determine if all 52 cards appear in the result, or some other goal (maybe all of this list and more)?

In the specific example, you have to assume your random number generator is opaque (just like it doesn't make sense to unit test the OS syscall or malloc- unless you write OSes).  It may be useful to measure the random number generator, but your goal isn't to write a random generator, just to see that you get 52 cards each time, and that that they change order.

That's a long way of saying that there are really two test tasks here: testing that the RNG is producing the right distribution, and checking that your card shuffle code is using that RNG to produce randomized results.  If you are writing the RNG, use statistical analysis to prove your distribution, if you're writing the card shuffler, make sure there are 52 non-repeated cards in each output (it's a better case for test by inspection that you're using the RNG).

K.Steff · Answer

An aspect of PRNGs that seems forgotten about is that all of its properties are statistical in nature: you can't expect that shuffling an array will result in a different permutation from the one you started with. Basically, if you are using a normal PRNG, the only thing you're guaranteed is that it doesn't use a simple pattern (hopefully) and that it has even distribution among the set of numbers it returns.

A proper test for a PRNG will involve running it at least 100 times and then checking the distribution of the output (which is a direct answer to the second part of the question).

An answer to the first question is almost the same: run the test about 100 times with {1, 2, ..., n} and count the number of times each element has been at each position. They should be all roughly equal if the shuffle method is any good.

An entirely different matter is how to test cryptography-grade PRNGs. This is a matter in which you probably shouldn't dwell, unless you really know what you are doing. People have been known to destroy (read: open catastrophic holes in) good cryptosystems with just a few 'optimizations' or trivial edits.

EDIT: I have thoroughly reread the question, the top answer and my own. While the points I make still stand, I would second Bill The Lizard's answer. Unit tests are Boolean in their nature - they either fail, or they succeed, and are therefore unsuited for testing "how good" are the properties of a PRNG (or a method using a PRNG), since any answer to this question would be quantitative, rather than polar.

sebastiangeiger · Answer

1. Unit test your algorithm

For the first question I would build a fake class that you feed a sequence of random numbers for which you know the outcome of your algorithm. That way you make sure the algorithm you build on top of your random function works. So something along the lines of:

Random r = new RandomStub([1,3,5,3,1,2]);
r.random(); //returns 1
r.random(); //returns 3
...

2. See if your random function makes sense

To the unit test you should add a test that runs multiple times and asserts that the results

are within the boundaries that you set (so, a dice roll is between 1 and 6) and
show a sensible distribution (do multiple test runs and see if the distribution is within x% of what you expected, e.g. for the dice roll you should see a 2 come up between 10% and 20% (1/6 = 16.67%) of the time given that you rolled it 1000 times).

3. Integration test for the algorithm and the random function

How often would you expect your array is sorted in the original sorting? Sort a couple hundred of times and assert that only x% of the time the sorting does not change.

This is actually already an integration test, you are testing the algorithm together with the random function. Once you are using the real random function you can't get away with single test runs anymore.

From experience (I wrote a genetic algorithm) I would say combining the unit test of your algorithm, the distribution test of your random function and the integration test is the way to go.

Telastyn · Answer

There are two parts to this: testing randomization and testing things that use randomization.

Testing randomization is relatively straightforward. You check that the period of the random number generator is as you expect it to be (for a few samples using a few kinda-random seeds, within some threshold) and that the distribution of the output over a large sample size is as you expect it to be (within some threshold).

Testing things that use randomization is best done with a deterministic psuedo-random number generator. Since the output of the randomization is known based on the seed (its inputs), then you can unit test as normal based on inputs vs expected outputs. If your RNG is not deterministic, then mock it with one that is deterministic (or simply not random). Test the randomization in isolation from the code that consumes it.

Matthew Flynn · Answer

Well, you'll never be 100% certain, so the best you can do is that it's probable that the numbers are random. Pick a probability -- say that a sample of numbers or items will come up x times given a million samples, within a margin of error. Run the thing a million times, and see if it's within the margin. Fortunately, computers make this sort of thing easy to do.

Answered by Matthew Flynn on October 29, 2021

How should I test randomness?

11 Answers

1. Unit test your algorithm

2. See if your random function makes sense

3. Integration test for the algorithm and the random function

Add your own answers!

Ask a Question