Variance of $\hat \mu$ from a Gaussian distribution

calculus, expected value, real-analysis, statistics

Assume $X \sim \mathcal{N}(\mu, \sigma^2)$ generates an independent sample $X_1, \ldots, X_n$, where $\sigma$ is known. The exercise is to find an unbiased estimator for $\mu^2$ and its variance.

In the textbook, the solutions are listed as $$ \hat \mu^2 = \bar X^2 - \sigma^2/n$$ $$Var(\hat \mu^2) = \mathbb{E}_{\mu}((\bar X - \mu)^2 + 2\mu(\bar X - \mu) - \sigma^2/n)^2 = 4\mu^2 \sigma^2/n + 2\sigma^4/n^2$$

Why is this so?

From what I'm getting, since $T(X)=\sum X_i$ is a sufficient statistic, and $T \sim \mathcal{N}(n\mu, n\sigma^2)$,
$$ \mathbb{E}(T^2) = \mathbb{E}(\sum_{i=1}^{n}X_i)^2=n \mathbb{E}(X_i^2)=n(\sigma^2 + \mu^2)$$
from where it follows that

$$ \mathbb{E}(T^2 - n \sigma^2) = n\mu^2$$

therefore $\hat \mu^2= T^2/n - \sigma^2 \left(=(n \bar X)^2/n - \sigma^2 =n \bar X^2 - \sigma^2\right)$ would seem like an unbiased estimator. However, this is off from the solution by a factor of $n$.

Further, assuming my approach was wrong, I can't see how they are getting the variance, since
$$ \begin{align}\mathbb{E}_{\mu}((\bar X - \mu)^2 + 2\mu(\bar X - \mu) - \sigma^2/n)^2 &= \mathbb{E}(\bar X^2-2\mu \bar X+ \mu^2 + 2\mu \bar X- 2\mu^2 - \sigma^2/n)^2 \\
&= \mathbb{E}(\bar X^2-\mu^2 - \sigma^2/n)^2
\end{align}$$

And isn't $\mathbb{E}\bar X^2 = \mathbb{E}(\frac{1}{n}\sum X_i)^2 = \frac{1}{n^2}\sum(\mathbb{E}X_i^2) = \frac{1}{n}(\sigma^2 + \mu^2)$?

It doesn't seem to add up, or I am missing something. A fresh pair of eyes would help!

Best Answer

You forgot to handle the cross terms. $$E(T^2) = \sum_i \sum_j E[X_i X_j] = n E[X_1^2] + n(n-1) E[X_1 X_2] = n (\sigma^2 + \mu^2) + n(n-1) \mu^2 = n \sigma^2 + n^2 \mu^2,$$ where $E[X_1 X_2] = \mu^2$ by independence, so $$E\left[\frac{1}{n^2}T^2 - \frac{\sigma^2}{n}\right] = \mu^2.$$ I think a similar correction when computing $E[\bar{X}^2]$ at the end of your post might resolve the variance computation too.
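The variance can be finished along the same lines. As a sketch (the shorthand $Z = \bar X - \mu \sim \mathcal{N}(0, \sigma^2/n)$ is introduced here and is not in the original post), the textbook expression is $\mathbb{E}(Z^2 - \sigma^2/n + 2\mu Z)^2$, and expanding the square gives

$$\mathbb{E}(Z^2 - \sigma^2/n)^2 + 4\mu\,\mathbb{E}[Z(Z^2 - \sigma^2/n)] + 4\mu^2\,\mathbb{E}Z^2 = \frac{2\sigma^4}{n^2} + 0 + \frac{4\mu^2\sigma^2}{n},$$

using $\mathbb{E}Z = \mathbb{E}Z^3 = 0$, $\mathbb{E}Z^2 = \sigma^2/n$ and $\mathbb{E}Z^4 = 3\sigma^4/n^2$. The R simulation below is a rough numerical check of both the unbiasedness and the variance formula; the parameter values and the seed are arbitrary choices made for illustration.

```r
# Monte Carlo check of the estimator xbar^2 - sigma^2/n for mu^2.
# All numeric values (mu, sigma, n, reps, seed) are illustrative assumptions.
set.seed(1)
mu <- 2; sigma <- 3; n <- 10; reps <- 1e5

# Each row is one sample of size n from N(mu, sigma^2).
samples <- matrix(rnorm(reps * n, mean = mu, sd = sigma), nrow = reps)
xbar <- rowMeans(samples)

est <- xbar^2 - sigma^2 / n          # estimator of mu^2
mean_est <- mean(est)                # should be close to mu^2
var_est <- var(est)                  # should be close to var_theory
var_theory <- 4 * mu^2 * sigma^2 / n + 2 * sigma^4 / n^2

print(c(mean_est = mean_est, target = mu^2))
print(c(var_est = var_est, var_theory = var_theory))
```

The simulated mean and variance should agree with $\mu^2$ and the closed-form variance up to Monte Carlo error, which shrinks as `reps` grows.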

Related Solutions

[Math] unbiased estimator of sample variance using two samples

Apart from the fact that it should be $m-1$ instead of $n-1$ in the right-hand denominator, your estimator for $\sigma^2$ looks fine. You can do slightly better on the variance of $\hat\mu$ (though the question didn't ask to optimize it): Consider a general convex combination

$$ \alpha\frac{X_1+\dotso+X_n}n+(1-\alpha)\frac{Y_1+\dotso+Y_m}{2m} $$

of the individual estimators for $\mu$. The variance of this combined estimator is

$$ n\left(\frac\alpha n\right)^2\sigma^2+m\left(\frac{1-\alpha}{2m}\right)^2\sigma^2=\left(\frac{\alpha^2}n+\frac{(1-\alpha)^2}{4m}\right)\sigma^2\;, $$

and minimizing this by setting the derivative with respect to $\alpha$ to zero leads to $\alpha=n/(n+4m)$, yielding the variance $\sigma^2/(n+4m)$. For $n=m$ the variance is $\frac15\sigma^2/n=0.2\sigma^2/n$, compared to $\frac5{16}\sigma^2/n\approx0.3\sigma^2/n$ for your estimator, and for $n$ fixed and $m\to\infty$ or vice versa, the variance of this estimator tends to zero whereas the variance of your estimator tends to a non-zero value.
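The minimization can be checked numerically. The following R sketch applies base R's `optimize` to the variance expression above (in units of $\sigma^2$); the sample sizes `n` and `m` are arbitrary values picked for illustration.

```r
# Variance of the combined estimator in units of sigma^2, as a function of alpha.
# n and m are illustrative sample sizes, not values from the question.
n <- 20; m <- 20
combined_var <- function(alpha) alpha^2 / n + (1 - alpha)^2 / (4 * m)

# One-dimensional numerical minimization over alpha in [0, 1].
opt <- optimize(combined_var, interval = c(0, 1))

alpha_closed <- n / (n + 4 * m)       # closed-form minimizer from the text
var_closed <- 1 / (n + 4 * m)         # closed-form minimum variance (times sigma^2)

print(c(numeric = opt$minimum, closed_form = alpha_closed))
print(c(numeric = opt$objective, closed_form = var_closed))
```

The numerical minimizer should match $n/(n+4m)$ to within the tolerance of `optimize`.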

You could optimize the variance of your unbiased variance estimator in a similar way, though the calculation would be a bit more involved.

[Math] Confidence Interval and Variance of Coefficient of Variation

Estimating CVs. The coefficient of variation (CV) is $\kappa = \sigma/\mu.$ It can be estimated by $\hat \kappa = K = S/\bar X,$ where $\bar X$ and $S$ are the sample mean and SD, respectively. For small $n,$ this estimate is biased on the low side, but for moderate and large samples the bias is small. Methods of finding confidence intervals (CIs) for the CV depend on the nature of the underlying distribution.
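The small-sample bias can be seen in a quick simulation. The R sketch below assumes normal data with $\mu = 200$ and $\sigma = 25$ (so $\kappa = 0.125$) and compares the average of $K$ over many samples of size $n = 5$ and $n = 100$; these settings, the seed, and the helper name `mean_cv` are illustrative choices.

```r
# Simulated average of the sample CV for a small and a moderate sample size.
# Population values, reps, and seed are assumptions made for illustration.
set.seed(42)
mu <- 200; sigma <- 25; kappa <- sigma / mu
reps <- 20000

# Average sample CV over `reps` normal samples of size n (hypothetical helper).
mean_cv <- function(n) {
  mean(replicate(reps, { x <- rnorm(n, mu, sigma); sd(x) / mean(x) }))
}

k_small <- mean_cv(5)
k_large <- mean_cv(100)
print(c(kappa = kappa, n5 = k_small, n100 = k_large))
```

Under these assumptions the average for $n = 5$ should fall noticeably below $\kappa$, while the average for $n = 100$ should sit much closer to it.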

Because the type of population distribution may be unknown, it may be useful to use a nonparametric bootstrap CI for $\kappa.$ Because the population may be skewed (especially right-skewed) in practice, the bootstrap must anticipate skewness.

Because I found the literature on CIs for the CV to be partly hidden behind paywalls and partly poorly explained, I'm wondering if bootstrap CIs may be the best solution for your application. I give two examples of bootstrap CIs below, one using a sample from a normal population and one using a sample from a gamma population. At least, you can compare these results with results from formulas you may find in your Internet searches.

Bootstrap CIs. If we knew the distribution of $V = K - \kappa,$ we could find bounds $L$ and $U$ cutting 2.5% from its lower and upper tails, respectively to get $P(L < K - \kappa < U) = 0.95,$ from which we would obtain the 95% CI $(K - U, K - L)$ for $\kappa.$

Not knowing the distribution of $V,$ we re-sample from our data $X = (X_1, X_2, \dots, X_n).$ Iteratively we find re-samples of size $n$ with replacement from $X,$ find $K^* = S^*/\bar X^*$ and then $V^* = K^* - \kappa^*$ for each re-sample, where the observed CV $K_{obs}$ from the original sample $X$ is used for $\kappa^*.$ Finally, we get $L^*$ and $U^*$ by cutting 2.5% from each tail of the $V^*$'s, the 'bootstrapped' values of $V$, and use these estimated bounds to get a 95% bootstrap CI.

Examples of Bootstrap CIs. As a demonstration, I use a sample $X$ of size $n = 100$ from $\mathsf{Norm}(\mu = 200, \sigma=25)$ with $\kappa = 0.125.$ In the outline above of the bootstrap procedure, $*$'s represented quantities based on re-sampling. In the R program below we use the suffix .re for the same purpose.

Note: It is important to understand that re-sampling does not create additional information. Re-sampling exploits information in existing data to do statistical analysis.

Normal. For the particular normal sample we used $K_{obs} = 0.118$, and the 95% nonparametric bootstrap CI obtained is $(0.102, 0.135).$ Because bootstrap procedures involve random re-sampling, each run of the program may give a slightly different CI, but not much different with as many as $B = 10^5 = 100,000$ iterations.

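# draw the normal sample and compute the observed CV
x = rnorm(100,  200, 25)
k.obs = sd(x)/mean(x);  k.obs
## 0.1180088
B = 10^5;  v.re = numeric(B)
for(i in 1:B) {
  # re-sample with replacement, then record V* = K* - K.obs
  x.re = sample(x, 100, replace=TRUE)
  k.re = sd(x.re)/mean(x.re)
  v.re[i] = k.re - k.obs }
# upper and lower 2.5% cutoffs of V*, then the CI (K.obs - U*, K.obs - L*)
UL = quantile(v.re, c(.975,.025))
k.obs - UL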
##     97.5%      2.5% 
## 0.1018754 0.1350186

Gamma. This bootstrap procedure is called 'nonparametric' because it does not assume any particular type of distribution for the data. A second sample of size $n = 100$ was taken from the distribution $\mathsf{Gamma}(shape=\alpha = 4, rate=\lambda=.1)$ with $\kappa = \sqrt{\alpha}/\alpha = 1/2.$ This sample has $K = 0.507$ and the 95% nonparametric bootstrap CI is $(0.442, 0.579).$ A second run of the bootstrap program with the same data gave the CI $(0.442, 0.580).$
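For reference, the gamma example can be reproduced in outline with the same procedure. The R sketch below draws a fresh sample, so its numbers will not match the ones quoted above; the seed, the smaller $B$, and the helper `cv` are assumptions made for illustration.

```r
# Nonparametric bootstrap CI for the CV of a gamma sample (illustrative sketch).
set.seed(2024)                         # arbitrary seed, not the original run
n <- 100
x <- rgamma(n, shape = 4, rate = 0.1)  # population CV is 1/sqrt(4) = 0.5

cv <- function(v) sd(v) / mean(v)      # hypothetical helper: sample CV
k.obs <- cv(x)

B <- 10^4                              # fewer iterations than the text, for speed
# Bootstrapped values of V* = K* - K.obs
v.re <- replicate(B, cv(sample(x, n, replace = TRUE)) - k.obs)

# Cut 2.5% from each tail of V*, then form (K.obs - U*, K.obs - L*)
UL <- quantile(v.re, c(0.975, 0.025))
ci <- unname(k.obs - UL)
print(k.obs)
print(ci)
```

Increasing `B` toward the $10^5$ used in the normal example should make the interval more stable from run to run.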
