Find the Mean Square Error for a biased estimator


In my textbook it is said:

… the mean square error for any estimator is equal to its variance plus the square of its bias.

I'm having trouble understanding how this translates in practice, so I went looking for practice problems that tackle this subject.

Here is one:

Consider two estimators $d_1$ and $d_2$ of a parameter $\theta$. If $E[d_1] = \theta$, $var(d_1) = 6$ and $E[d_2] = \theta + 2$, $var(d_2) = 2$, which estimator should be preferred?

The solution to this problem revolves around computing the mean square error (MSE) of both estimators and then deciding which one to pick. Looking up the solution, we have this:

Since $d_1$ is an unbiased estimator its MSE is equal to its variance. For $d_2$ the MSE is (variance + square of its bias):
[Image: the solution's line-by-line computation of the MSE of $d_2$]

Note: the formula for the MSE is $r(d_i, \theta) = E[(d_i - \theta)^2]$. Also, when you see $E(…)^2$ it means $E[(…)^2]$.

I don't understand where and how the square of the bias is being computed?

Why do we add $+ 2$ after substituting $\theta$ with $\theta + 2$ in the second line?

Where does the part $4E[d_2 - (\theta + 2)] + 4$ come from? (Is it our square of the bias?)

Why does $E[d_2 - (\theta + 2)] = 0$?

Why does $E[(d_2 - (\theta + 2))^2] = var(d_2) = 2$? I thought we had a biased estimator?

Best Answer

The point of writing $$d_2-\theta=\color{blue}{d_2-(\theta+2)}+2$$ is to compare $d_2$ to its expected value. Note that the blue expression has zero expectation, and the expectation of its square is just the variance of $d_2$ (by definition of variance).
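
Written out step by step, the decomposition can be sketched as follows. This is a reconstruction that uses only the values given in the question, $E[d_2]=\theta+2$ and $var(d_2)=2$, so the line breaks may differ from those in the textbook solution:

$$\begin{aligned}
E[(d_2-\theta)^2] &= E\big[(\color{blue}{d_2-(\theta+2)}+2)^2\big] \\
&= E\big[(d_2-(\theta+2))^2\big] + 4E[d_2-(\theta+2)] + 4 \\
&= var(d_2) + 4\cdot 0 + 4 \\
&= 2 + 4 = 6.
\end{aligned}$$

The middle term vanishes because $E[d_2-(\theta+2)] = E[d_2]-(\theta+2) = 0$, and the final $+4$ is the square of the bias, $2^2$. The first term is the variance because $\theta+2$ is the mean of $d_2$, not because $d_2$ is unbiased for $\theta$.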

You can also expand $E[(d_2-\theta)^2]$ without rewriting the expression in this way; you get $$E[d_2^2]-2\theta E[d_2]+\theta^2$$

But the first term is just the variance plus $E[d_2]^2$, that is, $E[d_2^2] = 2 + (\theta+2)^2$, and the second term is $-2\theta(\theta+2) = -2\theta^2-4\theta$. Thus we have

$$2+(\theta+2)^2-2\theta^2-4\theta+\theta^2$$

Simplifying gives the same MSE we obtained the other way, $6$. The first solution is algebraically cleverer but by no means necessary.
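
If it helps to see the identity numerically, here is a minimal Python sketch that compares variance plus squared bias with a simulated $E[(d-\theta)^2]$. The question does not specify any distribution, so the normal distribution, the value `theta = 10.0`, and the helper names `mse_from_bias_variance` and `simulated_mse` are all assumptions made purely for illustration.

```python
import random


def mse_from_bias_variance(variance, bias):
    """MSE via the identity MSE = variance + bias**2."""
    return variance + bias ** 2


def simulated_mse(mean, variance, theta, n=200_000, seed=0):
    """Monte Carlo estimate of E[(d - theta)^2] for d ~ Normal(mean, variance).

    Normality is an assumption made only for this illustration.
    """
    rng = random.Random(seed)
    sd = variance ** 0.5
    total = 0.0
    for _ in range(n):
        d = rng.gauss(mean, sd)
        total += (d - theta) ** 2
    return total / n


theta = 10.0  # hypothetical true parameter value; the result does not depend on it

# Exact values from the identity, using the numbers in the question.
mse_d1_exact = mse_from_bias_variance(variance=6, bias=0)
mse_d2_exact = mse_from_bias_variance(variance=2, bias=2)

# Simulated values: d_1 is centred at theta, d_2 at theta + 2.
mse_d1_sim = simulated_mse(mean=theta, variance=6, theta=theta)
mse_d2_sim = simulated_mse(mean=theta + 2, variance=2, theta=theta, seed=1)

print(mse_d1_exact, mse_d2_exact)
print(round(mse_d1_sim, 2), round(mse_d2_sim, 2))
```

The first line printed is `6 6`, and the simulated values should land close to $6$ as well, up to sampling noise. Under these assumptions the two estimators have the same MSE: the larger variance of $d_1$ is exactly offset by the squared bias of $d_2$.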