[Math] How many times will a consecutive sequence of throws randomly appear if I throw a four-sided die N times

dice, probability

The problem is related to nucleotides and genomes, but it's better to simplify it using dice.

EDIT 2: My original question wasn't very clear, so I've reworded it

I know that the probability of getting the 7-digit sequence "4213433" if I throw a four-sided die 7 times is $\frac{1}{4^7}$.

But, what is the expected occurrence of the sequence "4213433" if I throw a four-sided die $4.5 \times 10^6$ times?

EDIT 1: Okay, I think I found my answer. If I throw a four-sided die $4.5 \times 10^6$ times, the sequence 4213433 will occur randomly $\frac{4.5 \times 10^6}{4^7}$, or about 274.658, times. Is that correct?

EDIT 3: Almost. As @joriki said in the comments, the exact answer would be $\frac{4.5 \times 10^6 - 6}{4^7}$.

Best Answer

The answer you gave in your edit is not the answer to your original question. It's the (almost but not quite correct) answer to the question "What is the expected value of the number of sequences 4213433 occurring if I throw a four-sided die $4.5\times 10^6$ times?" This is in fact a much easier question because of the linearity of expectation. You don't have to worry about correlations; you can just add up the expectation values for each of the slots at which the sequence might occur. However, there aren't $4.5\times10^6$ of these, only $4.5\times10^6-6$, because a sequence of length $7$ cannot start in any of the last $6$ positions. The expected number of occurrences is therefore $(4.5\times10^6-6)/4^7$, which is the same as your value up to the decimal places you specified.
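The slot-counting argument can be checked numerically. The following is a minimal sketch, assuming a fair die and independent throws; the function name `expected_occurrences` and its parameters are made up for illustration and do not come from the question.

```python
from fractions import Fraction


def expected_occurrences(n_throws, pattern_length, sides=4):
    """Expected number of (possibly overlapping) occurrences of one fixed
    pattern in n_throws independent throws of a fair die.

    By linearity of expectation this is (number of starting slots) times
    (probability that the pattern appears at a given slot).
    """
    slots = n_throws - pattern_length + 1  # valid starting positions
    if slots <= 0:
        return Fraction(0)
    return Fraction(slots, sides ** pattern_length)


n = 4_500_000
exact = expected_occurrences(n, 7)   # (4.5e6 - 6) / 4^7
naive = Fraction(n, 4 ** 7)          # 4.5e6 / 4^7, ignores the last 6 slots

print(float(exact))
print(float(naive))
print(float(naive - exact))          # the gap is 6 / 4^7
```

The two values differ by only $6/4^7$, which is why they agree to three decimal places.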

Your original question is a bit harder: to find the probability of at least one such sequence occurring, you need to take into account that the events of the sequence occurring in overlapping slots aren't independent. For instance, if you roll the die $3$ times, the probability of getting the sequence $44$ at least once is $7/4^3$, whereas the probability of getting the sequence $43$ at least once is $8/4^3$, even though the expected number of occurrences is $2/4^2$ in both cases. (The outcome $444$ contains $44$ in both slots but counts only once towards "at least once", while $43$ cannot overlap with itself.)
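The three-throw example is small enough to verify by listing all $4^3 = 64$ outcomes. This is a brute-force sketch for illustration only (the helper names `count_occurrences` and `brute_force` are hypothetical); it is not practical for $4.5 \times 10^6$ throws.

```python
from fractions import Fraction
from itertools import product


def count_occurrences(rolls, pattern):
    """Count occurrences of pattern in rolls, allowing overlaps."""
    k = len(pattern)
    return sum(1 for i in range(len(rolls) - k + 1) if rolls[i:i + k] == pattern)


def brute_force(pattern, n_throws, faces="1234"):
    """Enumerate every outcome of n_throws throws and return
    (probability of at least one occurrence, expected number of occurrences)."""
    total = 0
    hits = 0         # outcomes containing the pattern at least once
    occurrences = 0  # occurrences summed over all outcomes
    for outcome in product(faces, repeat=n_throws):
        c = count_occurrences("".join(outcome), pattern)
        total += 1
        hits += 1 if c > 0 else 0
        occurrences += c
    return Fraction(hits, total), Fraction(occurrences, total)


for pattern in ("44", "43"):
    p_at_least_once, expected = brute_force(pattern, 3)
    print(pattern, p_at_least_once, expected)
```

Both patterns give the same expected count, but the self-overlapping pattern 44 has the smaller probability of appearing at least once.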