Pearson product-moment correlation coefficient of a coin toss

correlation, covariance, probability

A fair coin is tossed 3 times. Let $X$ be the number of $H$'s in the first 2 tosses, $Y$ the number of $H$'s in the last 2 tosses, and $Z$ the number of $T$'s in the last 2 tosses. I need to find the Pearson correlation coefficient of each pair.

One is really straightforward: since $Z = 2-Y$, we have $\rho_{YZ} = -1$. I am not sure about the others. This is my attempt:

We define $X_1$ to be the result of the first toss and $X_3$ to be the result of the third toss (each equals 1 for heads and 0 for tails). Then $Y = X_3 + X - X_1$. We therefore have that
$$
Cov(X,Y) = E(XY) - EXEY = E\left(XX_3 + X^2 - XX_1\right) - EX\left(EX_3+EX-EX_1\right)
$$
$$
= E(XX_3) + EX^2 - E(XX_1) - EXEX_3 - (EX)^2 + EXEX_1 = Cov(X,X_3) + Var(X) - Cov(X,X_1)
$$
Since $X,X_3$ are independent, $Cov(X,X_3) = 0$.
Also, since $X\sim Bin\left(2,\frac{1}{2}\right)$, we have that $Var(X) = \frac{1}{2}$.

$XX_1$ is nonzero only when $X_1 = 1$, in which case $X$ is either 1 or 2. The probabilities are
$$
P(X = 1 \cap X_1 = 1) = P(X=1 | X_1=1)P(X_1=1) = \frac{1}{2}\frac{1}{2} = \frac{1}{4}
$$
$$
P(X = 2 \cap X_1 = 1) = P(X = 2 | X_1=1)P(X_1 = 1) = \frac{1}{2}\frac{1}{2} = \frac{1}{4}
$$
and therefore
$$
P(XX_1 = 1) = \frac{1}{4}
$$
$$
P(XX_1 = 2) = \frac{1}{4}
$$
which yields
$$
E(XX_1) = \frac{1}{4} + \frac{1}{2} = \frac{3}{4}
$$
Also, $EX = 1, EX_1 = \frac{1}{2}$. To sum up, we have that
$$
Cov(X,Y) = Var(X) - Cov(X,X_1) = \frac{1}{2} - \frac{3}{4} + \frac{1}{2} = \frac{1}{4}
$$
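As a sanity check on the numbers above, the eight equally likely outcomes can be enumerated directly. The following is only a minimal sketch and not part of the derivation itself; the helper names `expect`, `X` and `Y` are illustrative, and heads are encoded as 1.

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of three fair tosses; 1 = heads, 0 = tails.
outcomes = list(product((0, 1), repeat=3))
p = Fraction(1, len(outcomes))

def expect(f):
    """Exact expectation of f(x1, x2, x3) under the uniform distribution."""
    return sum(p * f(*w) for w in outcomes)

def X(x1, x2, x3):
    return x1 + x2  # heads in the first two tosses

def Y(x1, x2, x3):
    return x2 + x3  # heads in the last two tosses

e_x_x1 = expect(lambda *w: X(*w) * w[0])                   # E(X X_1)
cov_x_x1 = e_x_x1 - expect(X) * expect(lambda *w: w[0])    # Cov(X, X_1)
cov_xy = expect(lambda *w: X(*w) * Y(*w)) - expect(X) * expect(Y)

print(e_x_x1, cov_x_x1, cov_xy)  # 3/4 1/4 1/4
```

Exact fractions are used so the printed values can be compared with the hand computation without rounding.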
We only need to calculate the variance of $Y$:
$$
Var(Y) = Var(X_3+X - X_1) = Var(X_3) + Var(X) +Var(X_1) + 2Cov(X,X_3) - 2Cov(X_1,X_3) - 2Cov(X,X_1)
$$
We note that $X_3$ is independent of both $X$ and $X_1$, so the two covariances involving $X_3$ vanish, and therefore
$$
= Var(X_3) + Var(X) + Var(X_1) - 2Cov(X,X_1) = \frac{1}{4} + \frac{1}{2} + \frac{1}{4} - 2\cdot\left(\frac{3}{4} - \frac{1}{2}\right) = \frac{1}{2}
$$

We therefore have that
$$
\rho_{XY} = \frac{1}{\frac{1}{2}\frac{1}{2}}\frac{1}{4} = 1
$$

The result says that $X,Y$ are perfectly linearly correlated, which seems pretty weird to me. What would the linear relation between the two be? Is my attempt correct?

Best Answer

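Your work is correct up to the last step, though it could be simpler. You just used the wrong formula for the correlation: the denominator is the product of the standard deviations, $\sqrt{Var(X)}\sqrt{Var(Y)}$, not of the variances.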

Here's my shot.

Let $H_1,H_2,H_3$ be the indicators of a head on the respective tosses. These are independent, with
$$\begin{align}
\mathsf E(H_n) &= \tfrac 12 \\[1ex]
\mathsf{Var}(H_n) &= \tfrac 14 \\[1ex]
\mathsf{Cov}(H_n,H_m)\vert_{m\neq n} &= 0 \\[1ex]
\mathsf{Var}(H_n+H_m)\vert_{n\neq m} &= \tfrac 12 \\[2ex]
\mathsf{Cov}(X,Y) &= \mathsf{Cov}(H_1+H_2,H_2+H_3) \\[1ex]
&= \mathsf{Cov}(H_1,H_2)+\mathsf{Cov}(H_2,H_2)+\mathsf{Cov}(H_1,H_3)+\mathsf{Cov}(H_2,H_3) \\[1ex]
&= 0 + \mathsf{Var}(H_2) + 0 + 0 = \tfrac 14 \\[2ex]
\mathsf{Corr}(X,Y) &= \dfrac{\mathsf{Cov}(X,Y)}{\sqrt{\mathsf{Var}(X)}\sqrt{\mathsf{Var}(Y)}} \\[1ex]
&= \dfrac{\tfrac 14}{\sqrt{\tfrac 12}\sqrt{\tfrac 12}} = \dfrac 12
\end{align}$$


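Also $\mathsf{Corr}(X,Z) = \mathsf{Corr}(X,2-Y) = -\mathsf{Corr}(X,Y) = -\tfrac 12$, since subtracting from a constant only flips the sign of the covariance and leaves the variance unchanged.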
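If you want to double-check all three coefficients numerically, a brute-force enumeration of the eight outcomes is enough. This is only a rough sketch; the helpers `mean`, `cov` and `corr` are names made up for this example, and each outcome is weighted equally because the coin is fair.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# All 8 equally likely outcomes (h1, h2, h3); 1 = heads, 0 = tails.
outcomes = list(product((0, 1), repeat=3))

def mean(values):
    # Uniform average, kept exact with Fraction.
    return sum(values, Fraction(0)) / len(values)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return mean([(u - ma) * (v - mb) for u, v in zip(a, b)])

def corr(a, b):
    # Pearson correlation: covariance over the product of standard deviations.
    return float(cov(a, b)) / sqrt(float(cov(a, a)) * float(cov(b, b)))

xs = [Fraction(h1 + h2) for h1, h2, h3 in outcomes]  # X: heads in tosses 1-2
ys = [Fraction(h2 + h3) for h1, h2, h3 in outcomes]  # Y: heads in tosses 2-3
zs = [2 - y for y in ys]                             # Z: tails in tosses 2-3

print(corr(xs, ys), corr(xs, zs), corr(ys, zs))
```

The printed values should agree with $\tfrac 12$, $-\tfrac 12$ and $-1$ respectively.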
