Mark a six-sided die with the results of six rolls of a previous die. How many iterations until all the faces match

combinatoricsdiceprobabilitypuzzle

This is FiveThirtyEight's "Riddler Classic" puzzle for 27 March, 2020:

From Chris Nho comes a question of rolling (and re-rolling) a die:

You start with a fair 6-sided die and roll it six times, recording the results of each roll. You then write these numbers on the six faces of another, unlabeled fair die. For example, if your six rolls were 3, 5, 3, 6, 1 and 2, then your second die wouldn’t have a 4 on it; instead, it would have two 3s.

Next, you roll this second die six times. You take those six numbers and write them on the faces of yet another fair die, and you continue this process of generating a new die from the previous one.

Eventually, you’ll have a die with the same number on all six faces. What is the average number of rolls it will take to reach this state?

Through numerical simulations, I know that the average number of rows to reach the final state is approximately 9.66, and the PDF of the number of rows to reach this state looks like

enter image description here

My question is: how do we calculate the average number of rows analytically? Is is possible to analytically calculate its PDF as well?

Best Answer

The probability that all sides are the same is $6$ times the probability that all sides are $1$. So we don’t have to worry about the distribution of all numbers on the die, we just have to keep track of the number of $1$s. That yields a Markov chain with $7$ different states, two of which ($0$ and $6$) are absorbing. The transition matrix is

$$ \mathsf P(i\to j)=6^{-6}\binom6ji^j(6-i)^{6-j} $$

(where $0^0=1$), or in matrix form:

$$ P=6^{-6}\pmatrix{ 46656&0&0&0&0&0&0\\ 15625&18750&9375&2500&375&30&1\\ 4096&12288&15360&10240&3840&768&64\\ 729&4374&10935&14580&10935&4374&729\\ 64&768&3840&10240&15360&12288&4096\\ 1&30&375&2500&9375&18750&15625\\ 0&0&0&0&0&0&46656\\ }\;. $$

To my surprise, this matrix has a rather nice eigensystem:

$$ P=\pmatrix{6\\5&5&-5&25&-5&3725&1\\4&8&-4&-8&8&-10576&2\\3&9&0&-27&0&14337&3\\2&8&4&-8&-8&-10576&4\\1&5&5&25&5&3725&5\\&&&&&&6}\\ \times\pmatrix{6\\&38160\\&&120\\&&&2448\\&&&&120\\&&&&&648720\\&&&&&&6}^{-1} \\ \times \pmatrix{1\\&\frac56\\&&\frac59\\&&&\frac5{18}\\&&&&\frac5{54}\\&&&&&\frac5{324}\\&&&&&&1} \\\times\pmatrix{1\\-2681&981&1125&1150&1125&981&-2681\\7&-8&-5&0&5&8&-7\\-14&33&-6&-26&-6&33&-14\\1&-4&5&0&-5&4&-1\\-1&6&-15&20&-15&6&-1\\&&&&&&1}\;, $$

where the first diagonal matrix is just for normalization (so that I could write the matrices containing the left and right eigenvectors with integers) and the second diagonal matrix contains the eigenvalues.

Thus, since we start in state $1$, the probability to have reached state $6$ after $n$ rolls is

$$ -\frac{5\cdot2681}{38160}\left(\frac56\right)^n+\frac{5\cdot7}{120}\left(\frac59\right)^n-\frac{25\cdot14}{2448}\left(\frac5{18}\right)^n+\frac{5\cdot1}{120}\left(\frac5{54}\right)^n-\frac{3725\cdot1}{648720}\left(\frac5{324}\right)^n+\frac16 \\[3pt] = -\frac{2681}{7632}\left(\frac56\right)^n+\frac7{24}\left(\frac59\right)^n-\frac{175}{1224}\left(\frac5{18}\right)^n+\frac1{24}\left(\frac5{54}\right)^n-\frac{745}{129744}\left(\frac5{324}\right)^n+\frac16\;. $$

We need to multiply this by $6$ to get the probability of having reached this state for any of the $6$ numbers on the die in order to get the probablity that the number $N$ of required rolls is less than or equal to $n$:

$$ \mathsf P(N\le n)=1-\frac{2681}{1272}\left(\frac56\right)^n+\frac74\left(\frac59\right)^n-\frac{175}{204}\left(\frac5{18}\right)^n+\frac14\left(\frac5{54}\right)^n-\frac{745}{21624}\left(\frac5{324}\right)^n\;. $$

Here we can check that $\mathsf P(N\le0)=0$ and $\mathsf P(N\le1)=6^{-5}$, as they must be. The probability that we need exactly $n$ rolls is

\begin{eqnarray} \mathsf P(N=n) &=& \mathsf P(N\le n)-\mathsf P(N\le n-1) \\[3pt] &=& \frac{2681}{6360}\left(\frac56\right)^n-\frac75\left(\frac59\right)^n+\frac{455}{204}\left(\frac5{18}\right)^n-\frac{49}{20}\left(\frac5{54}\right)^n+\frac{47531}{21624}\left(\frac5{324}\right)^n \end{eqnarray}

for $n\gt0$ and $\mathsf P(N=0)=0$. Here’s a plot, in agreement with your numerical results. The expected value of $N$ is

\begin{eqnarray} \mathsf E[N] &=&\sum_{n=0}^\infty\mathsf P(N\gt n) \\[3pt] &=& \sum_{n=0}^\infty\left(\frac{2681}{1272}\left(\frac56\right)^n-\frac74\left(\frac59\right)^n+\frac{175}{204}\left(\frac5{18}\right)^n-\frac14\left(\frac5{54}\right)^n+\frac{745}{21624}\left(\frac5{324}\right)^n\right) \\[6pt] &=& \frac{2681}{1272}\cdot\frac6{6-5}-\frac74\cdot\frac9{9-5}+\frac{175}{204}\cdot\frac{18}{18-5}-\frac14\cdot\frac{54}{54-5}+\frac{745}{21624}\cdot\frac{324}{324-5} \\[6pt] &=& \frac{31394023}{3251248} \\[6pt] &\approx&9.656\;, \end{eqnarray}

also in agreement with your numerical results.