The maximum likelihood estimator of $\theta$ for $N(\theta, \theta)$ is
$$
\bar{\theta}^{MLE}_n = \frac{1}{2}\left(\sqrt{1 + 4 \overline{x^2}} - 1\right)
$$
where $\overline{x^2}=\frac{1}{n}\sum_{i=1}^n x_{i}^2$
(see https://math.stackexchange.com/a/3478232/735298 for a derivation).
Now, playing with this estimator, I tried some simple choices of $x_i$, like:
- $x_1=x_2=\dots=x_n=1$, which gives me
$$
\bar{\theta}^{MLE}_n = \frac{1}{2}\left(\sqrt{1 + 4\cdot 1} - 1\right) = \frac{1}{2}\left(\sqrt{5} - 1\right)=0.618\dots
$$
I expected to get $1$ here, since intuitively a mean of $1$ (with any variance) seemed like it should be the most probable parameter.
On the other hand, for $x_1=x_2=\dots=x_n=0$ we get the expected $\bar{\theta}^{MLE}_n=0$.
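Both cases can be checked numerically. Here is a minimal sketch of the estimator above (the helper name `theta_mle` is my own, not from the derivation):

```python
import math

def theta_mle(xs):
    """MLE of theta for N(theta, theta): (sqrt(1 + 4*mean(x^2)) - 1) / 2."""
    x2_bar = sum(x * x for x in xs) / len(xs)  # mean of squared samples
    return 0.5 * (math.sqrt(1 + 4 * x2_bar) - 1)

print(theta_mle([1.0] * 100))  # (sqrt(5) - 1) / 2 = 0.618...
print(theta_mle([0.0] * 100))  # 0.0
```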
My questions:
- What's wrong with my intuition described above?
- What is the explanation for $\bar{\theta}^{MLE}_n = 0.618\dots$ when all $x_i=1$?
Best Answer
The problem with the data samples $$x_1=x_2=\dots=x_n=1$$ is the question: are they really generated by $\mathcal{N}(\theta,\theta)$? Yes, their sample mean is $1$, but their sample variance is $0$, not $1$, and the mismatch only becomes more glaring as $n$ grows. Roughly speaking, the sampling distribution model has a hard time believing the data: the MLE has to compromise between matching the mean and matching the variance, and lands at $\theta \approx 0.618$, between the two.
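One way to see this concretely (a sketch of my own, not part of the original answer): write down the log-likelihood of $\mathcal{N}(\theta,\theta)$ and compare it at $\theta = 0.618\dots$ and $\theta = 1$ for the all-ones data. The smaller $\theta$ really does explain the data better.

```python
import math

def loglik(theta, xs):
    """Log-likelihood of data xs under N(theta, theta), theta > 0."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * theta)
            - sum((x - theta) ** 2 for x in xs) / (2 * theta))

xs = [1.0] * 100
theta_hat = 0.5 * (math.sqrt(5) - 1)  # the MLE value 0.618...

print(loglik(theta_hat, xs))  # higher log-likelihood than...
print(loglik(1.0, xs))        # ...the "intuitive" theta = 1
```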
In contrast, the data samples $$x_1=x_2=\dots=x_n=0$$ fit the model $\mathcal{N}(\theta,\theta)$ perfectly for $\theta = 0$, so the sampling distribution can explain the data exactly. That said, you can use any software such as Python or MATLAB to generate pseudorandom numbers sampled from a normal distribution with equal mean and variance, e.g. $10^4$ samples from $\mathcal{N}(1,1)$, and check that the estimator then comes out $\sim 1$.
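That check can be run in Python as follows (a sketch; the original answer used a MATLAB snippet that is not shown here, and the helper name `theta_mle` is my own):

```python
import math
import random

def theta_mle(xs):
    """MLE of theta for N(theta, theta): (sqrt(1 + 4*mean(x^2)) - 1) / 2."""
    x2_bar = sum(x * x for x in xs) / len(xs)
    return 0.5 * (math.sqrt(1 + 4 * x2_bar) - 1)

random.seed(0)
# 10^4 samples from N(1, 1): mean 1, standard deviation 1
xs = [random.gauss(1.0, 1.0) for _ in range(10_000)]
print(theta_mle(xs))  # close to 1
```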