In my lecture script on statistical signals, I am having trouble reproducing the stated result for the **variance of an estimator** $T$.

## Here is the example:

Given the observations $X_1, \dots , X_N$ of a uniformly distributed random variable

$$

X: \Omega \rightarrow [0,\theta]

$$

with $[0, \theta] \subset \mathbb{R}$ such that the CDF is $F_X (\xi) = \frac{\xi}{\theta}$ and the PDF is $f_X(\xi) = \frac{1}{\theta}$ for $0 \leq \xi \leq \theta$.

To estimate the upper bound $\theta$ of the uniform distribution, the expected value $\mathbb{E}[X] = \frac{\theta}{2}$ is used, which is the mean of this particular uniform distribution. The estimator $T$ is then given as:

$$

T = 2 \cdot \underbrace{\frac{1}{N} \sum_{i = 1}^N X_i}_{Average}: \quad x_1, \dots , x_N \rightarrow \hat{\theta}

$$

Now the expected value for this estimator is calculated as:

$$

\mathbb{E}[T(X_1, \dots , X_N)] = \mathbb{E}[\frac{2}{N}\sum_{i = 1}^{N}X_i] = \frac{2}{N}\sum_{i = 1}^{N}\mathbb{E}[X_i] = \frac{2}{N} \cdot N \cdot \frac{\theta}{2} = \theta

$$

which makes this estimator unbiased since the expected value is exactly the wanted parameter $\theta$.

Finally, the variance $Var[T]$ is simply stated as:

$$

Var[T] = \frac{\theta^2}{3N}

$$

However, I don't know how to obtain this result.

# I tried two approaches to get the same result for the variance:

In my first approach, which uses the definition of the variance in terms of the expected value of the estimator, there seems to be an error in my calculations or something I am missing.

The second approach gives me the expected result, but I don't really understand the properties/rules of the variance that I used.

## First approach:

According to the definition the variance is:

$$

Var[T(X_1, \dots , X_N)] = \mathbb{E}[(T-\mathbb{E}[T])^2] = \mathbb{E}[T^2] - \mathbb{E}[T]^2

$$

To get the variance, $\mathbb{E}[T^2]$ and $\mathbb{E}[T]^2$ are needed. $\mathbb{E}[T]^2 = \theta^2$ is already available (see above). There seems to be an **error** in my calculation of $\mathbb{E}[T^2]$:

$$

\mathbb{E}[T^2] = \mathbb{E}\left[\frac{4}{N^2}\left(\sum_{i = 1}^N X_i\right)^2\right]

$$

Assuming that $X_i$ and $X_j$ are independent, this becomes (I am really not sure whether this step is correct):

$$

\mathbb{E}[T^2] = \frac{4}{N^2}\sum_{i = 1}^N \mathbb{E}[X_i^2]

$$

Now using the expectation of a function of a random variable, $\mathbb{E}[g(X)] = \int_{\mathbb{R}}{g(x)f_X (x)dx}$ with $g(x) = x^2$, the previous equation becomes:

$$

\mathbb{E}[T^2] = \frac{4}{N^2}N \int_{0}^{\theta}{x^2 \frac{1}{\theta} dx} = \frac{4}{N} \frac{\theta^3}{3} \frac{1}{\theta} = \frac{4}{N} \frac{\theta^2}{3}

$$

However, this obviously leads to:

$$

Var[T] = \mathbb{E}[T^2] - \mathbb{E}[T]^2 = \frac{4}{N} \frac{\theta^2}{3} - \theta^2 = \frac{\theta^2 (4 - 3N)}{3N}

$$

which is wrong. Can you please tell me where I made a mistake?

## Second approach using properties of variances (correct result):

$$

Var[T] = Var\left[\frac{2}{N} \sum_{i = 1}^N X_i\right] = \frac{4}{N^2} Var\left[\sum_{i = 1}^N X_i\right] = \frac{4}{N^2} \sum_{i = 1}^N Var[X_i] = \frac{4}{N^2} N \frac{\theta^2}{12} = \frac{\theta^2}{3N}

$$

where I used the variance of the uniform distribution $Var[X_i] = \frac{\theta^2}{12}$ and the following rules for variance:

$$

Var[\alpha X + \beta] = \alpha^2 Var[X]

$$

and

$$

Var[\sum_{i = 1}^N X_i] = \sum_{i = 1}^N Var[X_i] + \sum_{i \neq j} Cov[X_i,X_j]

$$

For the second rule I assumed that the observations $X_i$ are independent, which leads to $\sum_{i \neq j} Cov[X_i,X_j] = 0$. I am not sure whether this is right, but it led to the correct result.

Could you please tell me how to derive these rules?

I would be glad to get the variance using my first approach with the formulas I mostly understand and not the second approach where I have no clue where these rules of the variance come from.

## Best Answer

Why is $\text{Cov}(X, Y) = 0$ if $X$ and $Y$ are independent? By definition, $\text{Cov}(X, Y) = \mathbb{E}((X - \mu_X)(Y - \mu_Y))$. Hence,
\begin{align*}
\text{Cov}(X, Y) &= \mathbb{E}(XY - \mu_YX - \mu_XY + \mu_X \mu_Y) \\
&= \mathbb{E}(XY) - \mathbb{E}(\mu_YX) - \mathbb{E}(\mu_XY) + \mathbb{E}(\mu_X \mu_Y) ~ (\text{by linearity of expectation}) \\
&= \mathbb{E}(XY) - \mu_Y\mathbb{E}(X) - \mu_X\mathbb{E}(Y) + \mu_X \mu_Y ~ (\mu_X ~ \text{and} ~ \mu_Y ~ \text{are constants}) \\
&= \mathbb{E}(XY) - \mathbb{E}(X)\mathbb{E}(Y)
\end{align*}
Since $\mathbb{E}(XY) = \mathbb{E}(X)\mathbb{E}(Y)$ if $X$ and $Y$ are independent, $\text{Cov}(X, Y) = 0$.

Next, why is $\text{Var}(\sum^n_{i = 1}X_i) = \sum^{n}_{i = 1}\text{Var}(X_i)$ if $X_1, X_2, \ldots, X_n$ are independent? Take $n = 2$:
\begin{align*}
\text{Var}(X_1 + X_2) &= \mathbb{E}((X_1 + X_2)^2) - (\mathbb{E}(X_1 + X_2))^2 ~ (\text{definition}) \\
&= \mathbb{E}(X_1^2 + 2X_1X_2 + X_2^2) - (\mathbb{E}(X_1) + \mathbb{E}(X_2))^2 ~ (\text{expansion and linearity of expectation}) \\
&= \mathbb{E}(X^2_1) + 2\mathbb{E}(X_1X_2) + \mathbb{E}(X_2^2) - \mathbb{E}(X_1)^2 - 2 \mathbb{E}(X_1) \mathbb{E}(X_2) - \mathbb{E}(X_2)^2 \\
&= \mathbb{E}(X^2_1) - \mathbb{E}(X_1)^2 + \mathbb{E}(X^2_2) - \mathbb{E}(X_2)^2 ~ (\text{again, }\mathbb{E}(X_1X_2) = \mathbb{E}(X_1)\mathbb{E}(X_2) \text{ by independence}) \\
&= \text{Var}(X_1) + \text{Var}(X_2)
\end{align*}
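
The same expansion for general $N$ also seems to show where the first approach in the question loses terms: the square of the sum contains cross terms $X_iX_j$ with $i \neq j$, and independence makes their expectation factor into $\mathbb{E}(X_i)\mathbb{E}(X_j)$ rather than vanish. A sketch, using $\mathbb{E}(X_i) = \frac{\theta}{2}$ and $\mathbb{E}(X_i^2) = \frac{\theta^2}{3}$ from the question and the fact that there are $N(N-1)$ ordered pairs with $i \neq j$:
\begin{align*}
\mathbb{E}\left(\left(\sum_{i = 1}^N X_i\right)^2\right) &= \sum_{i = 1}^N \mathbb{E}(X_i^2) + \sum_{i \neq j} \mathbb{E}(X_i)\mathbb{E}(X_j) \\
&= N\frac{\theta^2}{3} + N(N-1)\frac{\theta^2}{4} \\
\mathbb{E}(T^2) &= \frac{4}{N^2}\left(N\frac{\theta^2}{3} + N(N-1)\frac{\theta^2}{4}\right) = \frac{4\theta^2}{3N} + \frac{(N-1)\theta^2}{N} \\
\text{Var}(T) &= \mathbb{E}(T^2) - \theta^2 = \frac{4\theta^2}{3N} - \frac{\theta^2}{N} = \frac{\theta^2}{3N}
\end{align*}
So the step $\mathbb{E}(T^2) = \frac{4}{N^2}\sum_{i = 1}^N \mathbb{E}(X_i^2)$ would only hold if every $\mathbb{E}(X_i)\mathbb{E}(X_j)$ were zero, which is not the case here because $\mathbb{E}(X_i) = \frac{\theta}{2} \neq 0$.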

Back to the original question, what is $\text{Var}(T)$?
\begin{align*}
\text{Var}\left(2\frac{\sum^n_{i = 1}X_i}{n}\right) &= \frac{4}{n^2} \sum_{i = 1}^n \text{Var}(X_i) \\
&= \frac{4}{n^2} \times \frac{n\theta^2}{12} \\
&= \frac{\theta^2}{3n}
\end{align*}
Done!
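
If you want a quick numerical sanity check of $\frac{\theta^2}{3n}$, a small Monte Carlo simulation should land close to it. This is only a rough sketch using the Python standard library; the function name `simulate_estimator_variance` and the values of `theta`, `n`, `trials` and `seed` are my own illustrative choices, not something from the question.

```python
import random
import statistics


def simulate_estimator_variance(theta, n, trials, seed=0):
    """Empirical mean and variance of T = 2 * mean(X_1..X_n), X_i ~ Uniform(0, theta)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    estimates = []
    for _ in range(trials):
        # Draw n independent observations and apply the estimator T.
        sample = [rng.uniform(0.0, theta) for _ in range(n)]
        estimates.append(2.0 * sum(sample) / n)
    return statistics.mean(estimates), statistics.variance(estimates)


# Illustrative (assumed) parameters: theta = 3, n = 10 observations per estimate.
theta, n = 3.0, 10
mean_T, var_T = simulate_estimator_variance(theta, n, trials=20000)

print("empirical E[T]:  ", mean_T, " theory:", theta)
print("empirical Var[T]:", var_T, " theory:", theta**2 / (3 * n))
```

The empirical values fluctuate with the seed and the number of trials, so expect approximate agreement with the theoretical values rather than an exact match.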