Find the variance for the number of runs

probability

A biased coin is tossed $n$ times and heads shows up with probability $p$ on each toss. Let us call a maximal block of consecutive throws with the same outcome a run, so that, for example, the sequence HHTHTTH contains five runs.

If $R$ is the random variable counting the number of runs, then $\mathbb{E}(R) = 1+(n-1)2pq$, where $q = 1-p$.

I want to work out the variance $var(R)$.

To do this I would like to use that $var(R) = var(R-1) = \mathbb{E}\left((R-1)^2\right) - (\mathbb{E}(R-1))^2$.

Let $I_j$ be the indicator function of the event that the outcome of the $(j+1)$th toss is different from the outcome of the $j$th toss. $I_j$ and $I_k$ are independent if $|j-k| > 1$, so that

\begin{equation*}
\begin{aligned}
\mathbb{E}\left((R-1)^2\right) ={} & \mathbb{E}\left\{\left(\sum_{j=1}^{n-1}I_j\right)^2\right\} \\
= {} &\mathbb{E} \left(\sum_{j=1}^{n-1} I_{j}^{2}+2 \sum_{j=1}^{n-2} I_{j} I_{j+1}+2 \sum_{j=1}^{n-3} \sum_{k=j+2}^{n-1} I_{j} I_{k}\right).
\end{aligned}
\end{equation*}

Now $\mathbb{E}(\sum_{j=1}^{n-1} I_{j}^{2}) = (n-1)2pq$ and $\mathbb{E}(2 \sum_{j=1}^{n-2} I_{j} I_{j+1}) = (n-2)2pq$.
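
Both values can be checked with a short computation, writing $q = 1-p$. Since $I_j$ only takes the values $0$ and $1$, we have $I_j^2 = I_j$, and $I_j I_{j+1} = 1$ exactly when tosses $j$, $j+1$, $j+2$ read HTH or THT:

\begin{equation*}
\begin{aligned}
\mathbb{E}(I_j^2) &= \mathbb{E}(I_j) = pq + qp = 2pq, \\
\mathbb{E}(I_j I_{j+1}) &= p^2 q + q^2 p = pq(p+q) = pq.
\end{aligned}
\end{equation*}

Summing the second line over $j = 1, \dots, n-2$ and doubling gives $(n-2)2pq$.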

We also have that $\mathbb{E}(2 \sum_{j=1}^{n-3} \sum_{k=j+2}^{n-1} I_{j} I_{k}) = (n-3)(n-4)(2pq)^2$ I believe.

But now I have lost confidence and I am not sure how to get the final result for the variance.

Is my approach correct and what should it be in the end?

Best Answer

Yes, your approach is correct, and your partial results are correct as well, except for the last expectation, which should be $(n-2)(n-3)(2pq)^2$ rather than $(n-3)(n-4)(2pq)^2$.
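
The factor $(n-2)(n-3)$ can be checked by counting. Of the $\binom{n-1}{2}$ pairs of indices $j < k$, exactly $n-2$ are adjacent; for each of the remaining pairs $I_j$ and $I_k$ are independent, so $\mathbb{E}(I_j I_k) = (2pq)^2$, and therefore

\begin{equation*}
2 \sum_{j=1}^{n-3} \sum_{k=j+2}^{n-1} \mathbb{E}(I_j I_k) = 2\left[\binom{n-1}{2} - (n-2)\right](2pq)^2 = (n-2)(n-3)(2pq)^2.
\end{equation*}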

Now, the only thing that remains to be done is to plug in the (partial) results to arrive at an expression for the variance.

That is:

\begin{align*}
\text{Var}(R) &= \text{Var}(R-1) \\
&= \mathbb{E}[(R-1)^2]-(\mathbb{E}[R-1])^2 \\
&= \mathbb{E} \left[\sum_{j=1}^{n-1} I_{j}^{2}\right] + \mathbb{E} \left[2 \sum_{j=1}^{n-2} I_{j} I_{j+1}\right] + \mathbb{E} \left[2 \sum_{j=1}^{n-3} \sum_{k=j+2}^{n-1} I_{j} I_{k}\right] - \left(\mathbb{E} \left[\sum_{j=1}^{n-1}I_j\right]\right)^2 \\
&= (n-1)2pq + (n-2)2pq +(n-2)(n-3)(2pq)^2-((n-1)2pq)^2 \\
&= 2pq(2n-3)+(2pq)^2(-3n+5)
\end{align*}
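
As a sanity check, the closed form can be compared against a brute-force enumeration for small $n$. The following is only a minimal sketch and not part of the derivation itself: the function names `count_runs`, `exact_variance` and `formula_variance` are illustrative, and it assumes $n \ge 2$, since the count $n-2$ of adjacent pairs used above does not apply when $n = 1$.

```python
from fractions import Fraction
from itertools import product


def count_runs(seq):
    """Number of maximal blocks of equal consecutive outcomes in seq."""
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def exact_variance(n, p):
    """Var(R) computed by summing over all 2**n outcome sequences."""
    q = 1 - p
    mean = Fraction(0)
    second_moment = Fraction(0)
    for seq in product("HT", repeat=n):
        heads = seq.count("H")
        prob = p**heads * q**(n - heads)  # probability of this exact sequence
        r = count_runs(seq)
        mean += prob * r
        second_moment += prob * r * r
    return second_moment - mean**2


def formula_variance(n, p):
    """Closed form 2pq(2n-3) + (2pq)^2 (5-3n), assumed valid for n >= 2."""
    q = 1 - p
    return 2 * p * q * (2 * n - 3) + (2 * p * q) ** 2 * (5 - 3 * n)


# Compare both expressions for a biased coin with p = 1/3.
p = Fraction(1, 3)
for n in range(2, 9):
    assert exact_variance(n, p) == formula_variance(n, p)
```

Using `fractions.Fraction` keeps the comparison exact, so no floating-point tolerance is needed; the price is that all $2^n$ sequences are enumerated, which is only practical for small $n$.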