[Math] Finding the expected value in the given problem.

expected valueprobability theoryrandom variables

It is given that a monkey types on a 26-letter keyboard with all the keys as lowercase English alphabets. Each letter is chosen independently and uniformly at random. If the monkey types 1,000,000 letters, what is the expected number of times the sequence "proof" appears?

Here is the suggested solution:

Let the random variable $X_i = 1$ if the word "proof" appears at the index $i$ else $X_i = 0$.
Let $n = 1,000,000$
Hence, the expected value of number of appearances of the word is:

\begin{align}
&E\left[ \sum\limits _{i=1}^{n-4}X_i \right] & & \text{Eqn 1} \\
&= \sum\limits _{i=1}^{n-4}E[X_i] & & \text{Eqn 2}
\end{align}

Now $E[X_i] = 26^{-5}$.

Hence expected number of appearances = $(n-4)\cdot26^{-5}$

(Note that the upper limit is $n-4$ because the word proof is of length $5$. Hence it can at most start at the index $n-4$ to finish the index $n$)

Now here is my doubt. We know that if $X_i = 1$, then the following few random variables like $X_{i+1}$,$X_{i+2}$ etc, can't be $1$ because you cannot have a word proof starting at index $i$ and then another word proof starting at index i+1. Where in this proof have we imposed that restriction?

Best Answer

Note that $E[X+Y]=E[X]+E[Y]$ holds in full generality, even if $X$ and $Y$ are not mutually independent. Proofs of linearity of expectation do not assume independence of $X$ and $Y$. Here's one for example. In other words you do not need to impose that restriction.

Related Question