My son recently picked up a copy of Rubiks Race, a kids’ board game that’s essentially a slide puzzle turned into a two-player showdown. You first generate a random target pattern by rolling nine six-coloured dice in a little container, and then each player slides their tiles around to match the pattern. Sometimes the target pattern is invalid – when it has 5 or more of the same colour – but how often exactly?

The fact that you have to re-roll the dice occasionally is hardly a big inconvenience, but it’s a fun exercise to think about the chance of it happening. We tried to guess the rate and we came up with the broad range of somewhere between 1% and 20%.

## Random sampling

One simple way to estimate the rate is to (virtually) roll the dice a lot of times and count how often it happens.

Running this with 10,000 rolls gives these results:

``````2,1586
3,5518
4,2391
5,453
6,45
7,7
``````

So we can see that 505 times out of 10,000 we would have to re-roll, about 5% of the time. More than half of the time, the most common colour will be showing on exactly 3 dice.

## Exhaustive counting

Can we get a more precise answer? We could of course sample more, but in this case there are not that many combinations in total – only 6^9, or about 10 million – so we can simply try them all.

And we get the exact answers:

``````2,1587600
3,5628000
4,2320920
5,472500
6,63000
7,5400
8,270
9,6
``````

So exactly 541176/10077696 times we’ll have to re-roll (about 5.73%).

The only problem remaining with this program is that it’s slow: on my laptop it took about 22 seconds to run. How can we make it faster?

## Faster

One way to make it faster is to observe that the colours are interchangeable so we’re counting separately patterns that look superficially different but are actually the same. (For example, the pattern with 6 reds, 2 greens and 1 blue is spiritually the same as the one with 6 greens, 2 yellows and 1 orange.) We can avoid some of the repetition by forcing the first die to come up, say, Red.

Then we multiply the resulting counts by 6 to get the full total. As expected, the running time is now just 1/6 of before, about 3.5 seconds.

Why is this correct? Think of it this way: we’ve counted all the different patterns that start with a red. But there’s nothing special about red; if we repeated the exercise using blue instead we’d get exactly the same counts. So the counts are correct for every starting colour, we just have to add them up to get the true counts.

## A direct approach

So far we haven’t used any real mathematical knowledge other than counting. (A bit of programming knowledge and spare CPU time can make up for a lot of mathematical ignorance.) If we just want to know how often we’d have to re-roll – without getting the distribution of “number of the most commonly occurring colour” – we can apply some high-school probability theory.

That is, we compute the probability of getting 5, 6, 7, 8 or 9 “red” faces from a roll of 9 dice, then add those up. We multiply the final result by 6 for the same reason as before.

The answer is 22549 % 419904, which is the same as before in reduced form.

## Dynamic Programming

But what if you really, truly, want the distribution of the number of the most commonly occurring colour, and you’re not willing to wait a few seconds to get it? Another approach is dynamic programming:

This gives the same result as the exhaustive counting from earlier, but in the more acceptable running time of a couple of milliseconds.