Asymptotic consistency with non-zero asymptotic variance – what does it represent?

asymptotics, consistency, convergence, mathematical-statistics, variance

The issue has come up before, but I want to ask a specific question, in the hope of eliciting an answer that will clarify (and classify) it:

In "Poor Man's Asymptotics", one keeps a clear distinction between

  • (a) a sequence of random variables that converges in probability to a constant

as contrasted to

  • (b) a sequence of random variables that converges in probability to a random variable (and hence in distribution to it).

But in "Wise Man's Asymptotics", we can also have the case of

  • (c) a sequence of random variables that converges in probability to a constant while maintaining a non-zero variance in the limit (a concrete construction is sketched right after this list).
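
As a concrete instance of (c), here is a standard construction (a hedged sketch: the $Z_n$ in the exploratory answer referenced below is of this type, though its exact details are not reproduced here):

$$\Pr\left(Z_n = \sqrt{n}\right) = \frac{1}{n}, \qquad \Pr\left(Z_n = 0\right) = 1 - \frac{1}{n}.$$

For any $\varepsilon > 0$, $\Pr(|Z_n| > \varepsilon) = 1/n \to 0$, so $Z_n \to 0$ in probability; yet $\mathrm{E}[Z_n] = 1/\sqrt{n}$ and $\mathrm{E}[Z_n^2] = n \cdot \tfrac{1}{n} = 1$, so $\operatorname{Var}(Z_n) = 1 - \tfrac{1}{n} \to 1$.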

My question is (stealing from my own exploratory answer below):

How can we understand an estimator that is asymptotically consistent but also has a non-zero, finite variance in the limit? What does this variance reflect? How does its behavior differ from that of a "usual" consistent estimator?

Threads related to the phenomenon described in (c) (look also in the comments):

Best Answer

I won't give a very satisfactory answer to your question because it seems to me a little too open-ended, but let me try to shed some light on why this question is a hard one.

I think you are struggling with the fact that the conventional topologies we use on probability distributions and random variables are bad. I've written a bigger piece about this on my blog but let me try to summarize: you can converge in the weak (and the total-variation) sense while violating commonsensical assumptions about what convergence means.

For example, you can converge in the weak topology towards a constant while keeping variance = 1 (which is exactly what your $Z_n$ sequence is doing). There is then a limit distribution (in the weak topology) that is this monstrous random variable which is most of the time equal to 0 but, infinitesimally rarely, equal to infinity.
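
A quick simulation makes this tangible (a minimal sketch, assuming NumPy; the $Z_n$ here is the hypothetical construction sketched after the list above, which may differ in detail from the one in the original exploratory answer): the probability of a non-zero value vanishes while the sample variance stays near 1.

```python
import numpy as np

rng = np.random.default_rng(0)
reps = 1_000_000  # Monte Carlo draws of Z_n for each fixed n

for n in [10, 100, 1_000, 10_000]:
    # Z_n = sqrt(n) with probability 1/n, and 0 otherwise.
    z = np.where(rng.random(reps) < 1 / n, np.sqrt(n), 0.0)
    # P(Z_n != 0) = 1/n -> 0 (convergence in probability to 0) ...
    # ... while Var(Z_n) = 1 - 1/n -> 1 (non-vanishing variance).
    print(f"n={n:>6}: P(Z_n != 0) ~ {np.mean(z != 0):.1e}, Var ~ {z.var():.3f}")
```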

I personally take this to mean that the weak topology (and the total-variation topology too) is a poor notion of convergence that should be discarded. Most of the convergences we actually use are stronger than that. However, I don't really know what we should use instead of the weak topology, so ...

If you really want to find an essential difference between $\hat \theta= \bar X+Z_n$ and $\tilde \theta=\bar X$, here is my take: both estimators are equivalent under the 0-1 loss (when the size of your mistake doesn't matter). However, $\tilde \theta$ is much better if the size of your mistakes matters, because $\hat \theta$ sometimes fails catastrophically.
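
To see this concretely, here is a hedged sketch (NumPy assumed, using the same hypothetical $Z_n$ construction as above) comparing the two estimators of the mean of an $N(\theta, 1)$ sample: under a 0-1 loss with threshold $\varepsilon$ both error probabilities shrink with $n$, but the mean squared error of $\hat\theta$ stays stuck near 1 because of its rare, huge spikes.

```python
import numpy as np

rng = np.random.default_rng(1)
theta, reps, eps = 0.0, 500_000, 0.25

for n in [100, 10_000]:
    # Sampling distribution of X_bar for n iid N(theta, 1) observations.
    x_bar = rng.normal(theta, 1 / np.sqrt(n), size=reps)
    # The spike term Z_n from above: sqrt(n) with probability 1/n, else 0.
    z = np.where(rng.random(reps) < 1 / n, np.sqrt(n), 0.0)

    for name, est in [("theta_tilde", x_bar), ("theta_hat  ", x_bar + z)]:
        err = est - theta
        # 0-1 loss: only whether we miss by more than eps; both vanish as n grows.
        # Squared loss: theta_hat pays ~1 for its rare catastrophic spikes.
        print(f"n={n:>6} {name}: P(|err|>{eps}) = {np.mean(np.abs(err) > eps):.1e},"
              f" MSE = {np.mean(err ** 2):.4f}")
```

The 0-1 column treats the two estimators alike as $n$ grows, while the MSE column separates them sharply, which is exactly the distinction drawn above.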