Probability – How Many Rolls for a Number to Repeat on a Fair Die?

probabilityprobability theory

PHP developer here so no real math background 🙁 gave up after googling and hence the question

As the question states how many times does a single die have to be rolled for any of the previously rolled numbers to repeat. Meaning if the numbers have not repeated with each successive roll then the solution should take into account all previously rolled numbers to compute the probability of repeating

Finally is there a formula to compute this for arbitrary certainty? Meaning how many time to roll the dice number repeating with 50% or 70% certainty?

Example 1:
First roll: 5
How many more times do I have to roll the die to get a 5 again.

Example 2:
First roll: 1
Second roll: 3
How many more times do I have to roll the dice to get a 1 or 3 again.

Example 3:
First roll: 2
Second roll: 1
Third roll: 4
How many more times do I have to roll the dice to get a 2 or 1 or 4 again.

And so on…

I would greatly appreciate a link to the theory or google terms as I would like to understand this clearly and not just get an answer.

Thank you very much for your time!

Best Answer

As you are guranteed to get a repeat after at most seven rolls, this is a very finite problem and we can simply enumerate all cases.

Here are possible runs of the consecutive dice rolls, where distinct letters stand for distinct numbers coming up, and the rolling is stopped as soon as we have a repeat.

  • $aa$: probability $\frac16$ (need the second roll to be the same as the first)

  • $aba$ or $abb$: probability $\frac56 \cdot \frac26$ (second roll distinct from first, and third roll same as any of the first two)

  • $abc$ and then repeat (that is, $abca$ or $abcb$ or $abcc$): $\frac56 \frac46 \cdot \frac36$

  • $abcd$ and then repeat: $\frac56 \frac46 \frac36 \cdot \frac46$

  • $abcde$ and then repeat: $\frac56 \frac46 \frac36 \frac26 \cdot \frac56$

  • $abcdef$ and then repeat: $\frac56 \frac46 \frac36 \frac26 \frac16 \cdot 1$

Thus, the number of times a dice is rolled before seeing a repeat is $2$ with probability $\frac16$, $3$ with probability $\frac5{18}$, and so on. In general, the probability that we roll $k+1$ times before seeing a repeat is $$\frac{n}{n}\frac{n-1}{n}\frac{n-2}{n}\cdots\frac{n-k+1}{n} \frac{k}{n} = \frac{n!k}{(n-k)!n^{k+1}}$$

From these numbers you can calculate the expected value, the thresholds for 50% or 70% certainty, etc.

For $n=6$, doing these calculations gives the expected number of times you roll the dice until seeing a repeat to be (using $r$ below for $k+1$, the number of rolls): $$\operatorname{E}[r] = \sum_{r=2}^7 r \frac{6!(r-1)}{(6-r+1)!6^r} = \frac{1223}{324} \approx 3.77$$ The cumulative probability distribution looks like the below:

CDF of r

Thus for example, you need to roll the dice at most $5$ times, with probability about $90\%$, at most $4$ times with probability about $70\%$, etc.

Related Question