Probability – Expected Number of Dice Rolls Before Rolling ‘1,2,3,4,5,6’

combinatorics, dice, expected value, probability, probability theory

QUESTION: I roll a single six-sided die repeatedly, recording the outcomes in a string of digits. I stop as soon as the string contains "$123456$". What is the expected length of the string?

My answer so far: My initial approach is to try to find the probability mass function. If we let the random variable $X$ be the length of the string, then we can easily calculate, for $x\in\{6,\ldots,11\}$,

$$\mathbb{P}(X=x) = \left(\frac{1}{6}\right)^6$$

and zero for $x<6$.

As soon as we reach $x\ge12$, we need to consider the probability that the final six rolls are "$123456$" but that sequence isn't contained in the string before that. I believe the result for $x\in\{12,\ldots,17\}$ becomes

$$\mathbb{P}(X=x) = \left(\frac{1}{6}\right)^6 - (x-11)\left(\frac{1}{6}\right)^{12}.$$

Now for $x\ge18$, we will need an extra term to account for the cases where two instances of "$123456$" are contained before the final six rolls. Indeed, each time $x$ crosses another multiple of six, we need to count the number of ways of fitting that many instances of the string before the final six rolls.

I've messed around with this counting problem but I'm getting bogged down in the calculations. Any input is appreciated to help shed some light on this. Thanks!
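The two PMF claims above can be sanity-checked exactly with a small dynamic program over a pattern-matching automaton for "$123456$" (the helper names below are mine, and this is a verification sketch, not a closed-form answer):

```python
from fractions import Fraction

# State k = length of the longest suffix of the roll string that is a
# prefix of 123456 (k = 0..5); reaching k = 6 means we stop.
def next_state(k, d):
    """Automaton transition: current match length k, new roll d."""
    if d == k + 1:              # roll d extends the prefix 1,2,...,k
        return k + 1
    return 1 if d == 1 else 0   # a roll of 1 restarts a match

def pmf(max_x):
    """Exact P(X = x) for x = 1..max_x."""
    sixth = Fraction(1, 6)
    probs = [Fraction(0)] * 6   # probs[k] = P(state k, not yet stopped)
    probs[0] = Fraction(1)
    out = {}
    for x in range(1, max_x + 1):
        new = [Fraction(0)] * 6
        hit = Fraction(0)
        for k, p in enumerate(probs):
            for d in range(1, 7):
                nk = next_state(k, d)
                if nk == 6:
                    hit += p * sixth
                else:
                    new[nk] += p * sixth
        probs, out[x] = new, hit
    return out

p = pmf(17)
# The constant value for x in {6,...,11}:
assert all(p[x] == Fraction(1, 6**6) for x in range(6, 12))
# The linear correction for x in {12,...,17}:
assert all(p[x] == Fraction(1, 6**6) - (x - 11) * Fraction(1, 6**12)
           for x in range(12, 18))
```

Both formulas check out exactly, so the difficulty really does start at $x\ge18$, where inclusion–exclusion over multiple earlier occurrences kicks in.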

Best Answer

Solving a set of linear recurrences is indeed a good, elementary way to go, but if you solve the recurrences in the answer by @Canardini (which I did using Wolfram Alpha), you find that the answer is $E[X] = 46656 = 6^6$. This is such a special number that you might wonder whether there is a more fundamental explanation, and indeed there is, using more powerful theorems about Markov chains.

Claim: If the desired string $x$ has the property that two copies of $x$ cannot overlap (which holds for $x = 123456$ in the OP question but does not hold for e.g. $x=111111$ or $x=121212$), then the expected time to first occurrence of $x$ is $6^L$ where $L$ is the length of $x$.

Consider a Markov chain with $6^6$ states, where each state is a possible sequence in $\{1,2,3,4,5,6\}^6$ and records the last $6$ rolls. Each state can transition to $6$ states (i.e. it has "out-degree" $6$), each with equal probability $1/6$. E.g. the state $\color{red}{1}13462$ can transition to $13462\color{blue}{j}$ where $\color{blue}{j}$ can be any of $\{1,2,3,4,5,6\}$. The red $\color{red}{1}$ represents the oldest die-roll result, which has "aged out", and the blue $\color{blue}{j}$ represents the newest die-roll result. Note that each state also has "in-degree" $6$, i.e. only $6$ states can transition to it. (Self-loops are possible and count toward both in-degree and out-degree.)

It is easy to see that such a Markov chain is aperiodic, positive recurrent, irreducible, ergodic, etc., all the good stuff. Further, because every state's in-degree and out-degree both equal $6$, the transition matrix is doubly stochastic, so the chain's unique stationary distribution $\pi$ (which is also its limiting distribution) is uniform: the $6^6$-long vector whose every entry is $6^{-6}$.
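The in-degree claim can be verified by brute force; this short sketch (my own, not part of the original argument) enumerates all $6^6$ states and counts how many states transition into each:

```python
from collections import Counter
from itertools import product

# State s = the last six rolls; s transitions to s[1:] + (j,) for each
# roll j in {1,...,6}. Count in-degrees over the whole chain.
indeg = Counter()
for s in product(range(1, 7), repeat=6):
    for j in range(1, 7):
        indeg[s[1:] + (j,)] += 1   # s -> (s_2,...,s_6, j)

assert len(indeg) == 6**6          # every state is reachable
assert set(indeg.values()) == {6}  # every state has in-degree exactly 6
```

Since every row and every column of the transition matrix contains exactly six entries of $1/6$, the matrix is doubly stochastic, and the uniform vector is stationary.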

A powerful (but somewhat "intuitively obvious?") theorem says that, if $\tau_{xx}$ is the revisit time from state $x$ back to state $x$, then:

Theorem: for a positive recurrent Markov chain with stationary distribution $\pi$, $E[\tau_{xx}] = 1/\pi_x$ for any state $x$.

E.g. see Prop 2.6 of these notes, Theorem 1.19 of these notes, or (for a slightly different version) Wikipedia.

IMHO this theorem is "intuitively obvious" in the following sense: the limiting distribution $\pi$ means in the long run the chain is going to spend $\pi_x$ fraction of the time in state $x$, so it only makes sense that the inter-visit time $\tau_{xx}$ has an expected value of $1/\pi_x$. However, such an "intuitive" argument is not rigorous, and the theorem has a non-trivial proof making use of positive recurrence.

Anyway, based on this theorem, and letting $x=123456$ be the state we're interested in, we have $E[\tau_{xx}] = 1/6^{-6} = 6^6$. That is, if we have just rolled $123456$, then the expected number of further rolls until the next $123456$ is $6^6$. This isn't quite the OP's question. However, if we have just rolled $123456$, then none of those old roll results can be part of the next $123456$, so the wait is equivalent to rolling from the very beginning (when the "history" of rolls is the empty string). This is a direct consequence of the fact that two copies of $123456$ cannot overlap. So the same expected time $6^6$ also answers the OP's question.
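As a numerical cross-check of $E[\tau_{xx}] = 6^6$ (a sketch of mine, not part of the answer above), one can collapse the $6^6$-state chain to a 7-state "match length" automaton and solve the standard absorbing-chain equations $e = \mathbf{1} + Qe$:

```python
import numpy as np

# State k = length of the longest suffix of the roll history that is a
# prefix of 123456; state 6 is absorbing.
def next_state(k, d):
    if d == k + 1:              # roll d extends the matched prefix
        return k + 1
    return 1 if d == 1 else 0   # a 1 restarts a match; anything else resets

# (I - Q) e = 1, where Q is restricted to transient states 0..5.
A = np.eye(6)
for k in range(6):
    for d in range(1, 7):
        nk = next_state(k, d)
        if nk < 6:
            A[k, nk] -= 1 / 6
e = np.linalg.solve(A, np.ones(6))

assert round(e[0]) == 6**6      # 46656, matching the theorem
```

The collapse is valid precisely because the future only depends on the current partial match, which is the same observation that makes the linear-recurrence approach work.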


Addendum: for some other strings, this theorem also gives a quick way to find the expected time of first occurrence. E.g. consider $y=111111$. The same theorem says that $E[\tau_{yy}] = 6^6$. But it is also clear that a revisit either happens right away (if the next roll is a $1$) or much later. I.e.:

$$E[\tau_{yy}] = 1 + (\frac16 \times 0 + \frac56 \times E[T_y])$$

where $T_y$ is the time to first occurrence of $y$ starting with no useful history (in particular, when starting from scratch with an empty history). Solving for $E[T_y]$, we have:

$$E[T_y] = (6^6 - 1) \times \frac65 = 55986$$

which can be easily verified by solving the corresponding set of linear recurrences for the string $y=111111$.
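Instead of writing out the full set of recurrences, the value $55986$ can also be confirmed exactly in a few lines (the unwinding trick below is my own bookkeeping, not the answer's derivation):

```python
from fractions import Fraction

# State k = current run of trailing 1s, with
#   e[k] = 1 + (1/6) e[k+1] + (5/6) e[0]   and   e[6] = 0.
# Write e[k] = a + c * e[0] and unwind from k = 5 down to k = 0:
a, c = Fraction(0), Fraction(0)          # e[6] = 0
for _ in range(6):                       # k = 5, 4, ..., 0
    a = 1 + Fraction(1, 6) * a
    c = Fraction(1, 6) * c + Fraction(5, 6)
e0 = a / (1 - c)                         # from e[0] = a + c * e[0]

assert e0 == 55986                       # = (6^6 - 1) * 6/5
```

The exact rational arithmetic confirms $E[T_y] = (6^6-1)\cdot\frac{6}{5} = 55986$ with no rounding concerns.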
