Calculate the correlation between the number of heads in 100 coin tosses and the number of heads in the first 10 of those tosses

correlation probability statistics

It is a bit cumbersome to explain:

Tossing a coin is a Bernoulli trial, with probability $p$ of seeing a head.

If we toss this coin 100 times, we see $X_{1}$ heads. Within those 100 tosses (this is important, we are NOT tossing another 10 times), we see $X_{2}$ heads in the first 10 tosses.

How do I calculate $corr(X_{1}, X_{2})$?

The only thing I can think of is that, since $X_{1} \ge X_{2}$, we are effectively running two experiments:

  1. toss a coin 10 times, we see $X_{2}$ heads
  2. independently toss a coin 90 times, we see $X_{3}$ heads

we want to calculate $corr(X_{2}, X_{2} + X_{3})$
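As a rough sanity check of this decomposition, the two experiments can be simulated and the sample correlation estimated. This is only a minimal sketch: the names `simulate_pair` and `pearson`, the seed, the sample size and the choice $p = 0.5$ are illustrative assumptions, not part of the problem.

```python
import random

def simulate_pair(p, rng):
    """Return (X2, X2 + X3): heads in the first 10 tosses, heads in all 100."""
    x2 = sum(rng.random() < p for _ in range(10))   # first 10 tosses
    x3 = sum(rng.random() < p for _ in range(90))   # remaining 90 tosses, independent
    return x2, x2 + x3

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)                     # fixed seed, assumed for reproducibility
pairs = [simulate_pair(0.5, rng) for _ in range(20000)]
first10, all100 = zip(*pairs)
estimate = pearson(first10, all100)
print(round(estimate, 3))                  # empirical estimate of corr(X2, X2 + X3)
```

The estimate is noisy, so it can only suggest what a closed-form answer should look like; repeating it with other values of `p` is a cheap way to see whether the correlation appears to depend on $p$.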

Best Answer

Let $Y_i$ take value $1$ if toss $i$ gives heads and let it take value $0$ otherwise.

We need to find: $$\mathsf{Corr}\left(\sum_{i=1}^{10}Y_i,\sum_{j=1}^{100}Y_j\right)=\frac{\mathsf{Cov}\left(\sum_{i=1}^{10}Y_i,\sum_{j=1}^{100}Y_j\right)}{\sqrt{\mathsf{Var}(\sum_{i=1}^{10}Y_i)}\sqrt{\mathsf{Var}(\sum_{i=1}^{100}Y_i)}}$$

By bilinearity of covariance, and because the tosses are independent and identically distributed, $\mathsf{Cov}(Y_i,Y_j)=0$ whenever $i\neq j$, so only the 10 terms with $i=j$ survive: $$\mathsf{Cov}\left(\sum_{i=1}^{10}Y_i,\sum_{j=1}^{100}Y_j\right)=\sum_{i=1}^{10}\sum_{j=1}^{100}\mathsf{Cov}(Y_i,Y_j)=10\mathsf{Cov}(Y_1,Y_1)=10\mathsf{Var}Y_1$$

Likewise, by independence and symmetry: $$\mathsf{Var}\left(\sum_{i=1}^{10}Y_i\right)=10\mathsf{Var}Y_1$$and: $$\mathsf{Var}\left(\sum_{i=1}^{100}Y_i\right)=100\mathsf{Var}Y_1$$

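Substituting these into the correlation formula gives the answer: $$\frac{10\mathsf{Var}Y_1}{\sqrt{10\mathsf{Var}Y_1}\sqrt{100\mathsf{Var}Y_1}}=\frac1{\sqrt{10}}$$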

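It is not even necessary to calculate $\mathsf{Var}Y_1$: it cancels, so the result is the same for every $p$ with $0<p<1$ (for $p=0$ or $p=1$ the variance is zero and the correlation is undefined).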
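For readers who want to double-check the cancellation, the squared correlation can also be computed exactly from the joint distribution with rational arithmetic. This is a sketch under stated assumptions: the helper names `binom_pmf` and `corr_squared` and the test values of $p$ are illustrative. It uses the fact that the heads among the first 10 tosses and the heads among the remaining 90 are independent binomial counts.

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, p):
    """Exact Binomial(n, p) probabilities, indexed by the number of heads."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def corr_squared(p):
    """Exact squared correlation between heads in tosses 1..10 and heads in tosses 1..100."""
    first = binom_pmf(10, p)   # heads among tosses 1..10
    rest = binom_pmf(90, p)    # heads among tosses 11..100, independent of the first 10
    e_a = e_b = e_aa = e_bb = e_ab = Fraction(0)
    for a, pa in enumerate(first):
        for r, pr in enumerate(rest):
            w, b = pa * pr, a + r          # joint probability, total heads in 100 tosses
            e_a += w * a
            e_b += w * b
            e_aa += w * a * a
            e_bb += w * b * b
            e_ab += w * a * b
    cov = e_ab - e_a * e_b
    return cov**2 / ((e_aa - e_a**2) * (e_bb - e_b**2))

# Two arbitrary (assumed) values of p; both results are exactly 1/10.
print(corr_squared(Fraction(1, 3)), corr_squared(Fraction(3, 4)))
```

Working with the squared correlation keeps everything rational; since the covariance is positive, the correlation itself is the positive square root, $\frac1{\sqrt{10}}$.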