[Math] Variance of the sum of sample means

covariance, probability, random variables, statistical-inference, statistics

Let $X$ be a normally distributed random variable with mean $\theta$ and variance $a>0$. Let $Y$ be a normally distributed random variable with mean $\theta$ and variance $b>0$. Both $a$ and $b$ are known. To estimate $\theta$, we choose 2 independent samples of size $n/2$ from $X_1, \dots, X_n$ and $Y_1, \dots, Y_n$ and take

$$T=\frac{\bar{X} + \bar{Y}}{2}$$

to estimate $\theta$. Here the bar denotes the sample mean.

I need to calculate the variance of $T$. Can I say that it is the sum of the variances of $\bar{X}$ and $\bar{Y}$, or are they correlated because both are used to estimate the same $\theta$?

Best Answer

Just because two sequences of RVs have the same mean doesn't mean they are correlated. The notation and wording indicate that we have two sequences, each consisting of iid samples from its respective distribution. However, the problem doesn't provide a joint distribution or correlation for $X$ with $Y$, nor does it pair the samples from $X$ and $Y$, so there's no way to know if they are correlated (I doubt that you are supposed to assume correlation).

Therefore, we can rely on the additivity of variance for uncorrelated RVs to get our answer. However, the variance of $T$ will not be the sum of the variances of $\bar X$ and $\bar Y$, because each coefficient in a linear combination of RVs is squared when computing the variance. Instead, the variance will be $\frac{1}{4}$ of the sum of the variances of the two sample means.

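Why? Here's the derivation, where the last step uses the fact that $\bar X$ and $\bar Y$ are uncorrelated:

$$\operatorname{Var}(T)=\operatorname{Var}\left(\tfrac{1}{2}(\bar X + \bar Y)\right)=\frac{1}{4}\operatorname{Var}(\bar X + \bar Y)= \frac{\operatorname{Var}(\bar X) + \operatorname{Var}(\bar Y)}{4}$$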
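As a sanity check, here is a minimal simulation sketch of this result. It assumes that the two samples are independent and that each has size $n/2$, so that $\operatorname{Var}(\bar X) = \frac{a}{n/2}$ and $\operatorname{Var}(\bar Y) = \frac{b}{n/2}$, which gives $\operatorname{Var}(T) = \frac{a+b}{2n}$. The function names and the values of `theta`, `a`, `b`, and `n` are made up for illustration and are not part of the original problem.

```python
import random
import statistics


def simulate_var_T(theta, a, b, n, trials=20000, seed=0):
    """Estimate Var(T) by simulation, with T = (Xbar + Ybar) / 2.

    Assumes independent samples of size n // 2 from N(theta, a) and N(theta, b),
    where a and b are variances (not standard deviations).
    """
    rng = random.Random(seed)
    m = n // 2
    sd_x, sd_y = a ** 0.5, b ** 0.5
    t_values = []
    for _ in range(trials):
        xbar = sum(rng.gauss(theta, sd_x) for _ in range(m)) / m
        ybar = sum(rng.gauss(theta, sd_y) for _ in range(m)) / m
        t_values.append((xbar + ybar) / 2)
    return statistics.pvariance(t_values)


def theoretical_var_T(a, b, n):
    """Var(T) = (Var(Xbar) + Var(Ybar)) / 4 with Var(Xbar) = a / (n / 2)."""
    m = n // 2
    return (a / m + b / m) / 4


# Illustrative values only (assumptions, not taken from the question).
theta, a, b, n = 1.0, 4.0, 9.0, 20
empirical = simulate_var_T(theta, a, b, n)
theory = theoretical_var_T(a, b, n)
print(f"simulated Var(T) = {empirical:.4f}, formula = {theory:.4f}")
```

The simulated value should land close to the formula; it would not if the two samples were drawn with a nonzero correlation, since the covariance term would then no longer vanish.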