Understanding MLE of $N(\theta, \theta)$

Tags: maximum likelihood, normal distribution, statistics

The maximum likelihood estimator of $\theta$ for an i.i.d. sample from $N(\theta, \theta)$ is

$$
\bar{\theta}^{MLE}_n = \frac{1}{2}\left(\sqrt{1 + 4 \overline{x^2}} - 1\right)
$$

where $\overline{x^2}=\frac{1}{n}\sum_{i=1}^n x_i^2$.

(See https://math.stackexchange.com/a/3478232/735298 for the derivation.)
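In short (a quick sketch of the first-order condition behind the linked derivation): the log-likelihood of an i.i.d. sample from $N(\theta,\theta)$ is

$$
\ell(\theta) = -\frac{n}{2}\log(2\pi\theta) - \frac{1}{2\theta}\sum_{i=1}^n (x_i - \theta)^2,
$$

and setting $\frac{d\ell}{d\theta} = -\frac{n}{2\theta} + \frac{1}{2\theta^2}\sum_{i=1}^n x_i^2 - \frac{n}{2} = 0$ yields $\theta^2 + \theta = \overline{x^2}$, whose positive root is the estimator above.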

Playing with this estimator, I tried some simple choices of the $x_i$, like:

  • $x_1=x_2=\dots=x_n=1$, which gives me

$$
\bar{\theta}^{MLE}_n = \frac{1}{2}(\sqrt{1 + 4\cdot 1} - 1) = \frac{1}{2}(\sqrt{5} - 1) = 0.618\ldots
$$

I expected to get $1$ here, since intuitively a sample with mean $1$, whatever its variance, should be best explained by $\theta = 1$.

Also, for $x_1=x_2=\dots=x_n=0$ we get the expected $\bar{\theta}^{MLE}_n=0$, as the quick check below confirms.
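As a quick numerical check (a minimal MATLAB sketch; the function handle name theta_mle is just one introduced here for illustration), both cases are easy to reproduce:

    % Estimator from the formula above, as an anonymous function
    theta_mle = @(x) 0.5 * (sqrt(1 + 4 * mean(x.^2)) - 1);
    theta_mle(ones(100, 1))    % 0.6180... = (sqrt(5) - 1)/2
    theta_mle(zeros(100, 1))   % 0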

My questions:

  • What's wrong with my intuition described above?
  • What is the explanation of the $\bar{\theta}^{MLE}_n = 0.618$ for all $x_i=1$?

Best Answer

What's wrong with my intuition described above? What is the explanation of the $\bar{\theta}^{MLE}_n = 0.618$ for all $x_i=1$?

The problem with the data sample $x_1=x_2=\dots=x_n=1$ is the question: is it really generated by $\mathcal{N}(\theta,\theta)$? Yes, the sample mean is $1$. But is the sample variance $1$? No: it is exactly $0$, and this mismatch only gets harder to ignore as $n$ grows. Roughly speaking, the sampling distribution model has a hard time believing the data: the MLE has to compromise between the sample mean of $1$, which pulls $\theta$ toward $1$, and the sample variance of $0$, which pulls $\theta$ toward $0$.
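The compromise is exactly a second-moment match: since $E[X^2] = \operatorname{Var}(X) + (E[X])^2 = \theta + \theta^2$ under the model, the MLE solves

$$
\theta^2 + \theta = \overline{x^2} = 1 \quad\Longrightarrow\quad \bar{\theta}^{MLE}_n = \frac{\sqrt{5}-1}{2} \approx 0.618,
$$

which sits strictly between the variance-only answer $\theta=0$ and the mean-only answer $\theta=1$.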

In contrast, the data sample $x_1=x_2=\dots=x_n=0$ fits the model $\mathcal{N}(\theta,\theta)$ perfectly for $\theta = 0$, so the sampling distribution can explain the data exactly. As a sanity check, you can use any software such as Python or MATLAB to generate pseudorandom numbers from a normal distribution with equal mean and variance. For example, in MATLAB, the following generates $10^4$ samples from $\mathcal{N}(1,1)$ and evaluates the estimator:

    x = 1 + randn(10000, 1);                % 10^4 draws from N(1,1); semicolon suppresses output
    0.5 * (sqrt(1 + 4 * mean(x.^2)) - 1)    % the MLE formula above

which gives an answer close to $1$.
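Along the same lines, here is a slightly longer sketch (the sample size and the grid of $\theta$ values are arbitrary choices for illustration) that checks the estimator against several true values of $\theta$:

    % Compare the MLE to the true theta for a few illustrative values
    for theta = [0.5 1 2 5]
        x = theta + sqrt(theta) * randn(100000, 1);   % draws from N(theta, theta)
        mle = 0.5 * (sqrt(1 + 4 * mean(x.^2)) - 1);
        fprintf('theta = %.1f   MLE = %.4f\n', theta, mle);
    end

In each case the estimate lands close to the true $\theta$, because $\overline{x^2}$ concentrates around $E[X^2] = \theta + \theta^2$.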
