Estimation – Can $T_n$ Be Considered a Consistent Estimator of $\theta$ by Monte Carlo Simulation Under Given Settings?

Tags: consistency, estimation, simulation

Given an i.i.d. random sample $X_1,\dots,X_n\sim N(\theta,1)$ with unknown parameter $\theta$, let $T_n=T_n(X_1,\dots,X_n)$ be an estimator of $\theta$. If we have rigorously proved that $T_n$ is a consistent estimator, can we verify the conclusion from simulation results?

Suppose I run a Monte Carlo simulation and show that, as the sample size $n$ gets larger and larger (take $n=500, 1000$), the Monte Carlo estimate of the mean square error suggests that
$$
E[(\theta-T_n)^2]\to 0
$$

where the Monte Carlo estimate of mean square error is given by
$$
\frac{1}{N}\sum_{i=1}^N(\theta-\hat{T}^{(i)}_n)^2
$$

where $N$ is the number of replications and $\hat{T}^{(i)}_n$ is the value of the estimator computed from the $i$-th simulated sample.
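For concreteness, here is a minimal sketch of such a simulation, taking $T_n$ to be the sample mean purely as a stand-in for the estimator (the value $\theta=2$, the sample sizes, and $N=10{,}000$ replications are also just illustrative assumptions):

```python
import numpy as np

# Minimal sketch: Monte Carlo estimate of E[(theta - T_n)^2] for a few sample sizes.
# Assumptions for illustration only: T_n is the sample mean, theta = 2, N = 10,000.
rng = np.random.default_rng(0)
theta, N = 2.0, 10_000

for n in (100, 500, 1000):
    # N replications: draw a sample of size n and compute T_n on each
    estimates = np.array([rng.normal(theta, 1.0, size=n).mean() for _ in range(N)])
    mse_hat = np.mean((theta - estimates) ** 2)   # (1/N) * sum_i (theta - T_n^(i))^2
    print(f"n = {n:4d}   Monte Carlo MSE estimate = {mse_hat:.5f}")
```

For the sample mean the true MSE is $1/n$, so the printed values should shrink roughly like $1/n$; the question is what such a pattern does or does not establish.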

Can we verify the conclusion under this setting?


Mathematically, I can show that for every $\epsilon>0$,

$$
P(|T_n-\theta|>\epsilon)\le \frac{E[(\theta-T_n)^2]}{\epsilon^2}
$$

So if $E[(\theta-T_n)^2]\to 0$, then $T_n$ is a consistent estimator of $\theta$.
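For instance, taking $T_n=\bar{X}_n$ in the $N(\theta,1)$ setting above gives a concrete instance of this bound:
$$
E[(\bar{X}_n-\theta)^2]=\operatorname{Var}(\bar{X}_n)=\frac{1}{n},
\qquad\text{so}\qquad
P(|\bar{X}_n-\theta|>\epsilon)\le \frac{1}{n\epsilon^2}\to 0.
$$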

However, I am not sure whether I can draw this conclusion from a Monte Carlo simulation.

Also, as noted in "Intuitive understanding of the difference between consistent and asymptotically unbiased", asymptotic unbiasedness plus vanishing variance implies consistency. So if the mean square error goes to zero, then as the sample size $n\to \infty$,
$$
E[|\theta-T_n|]\le (E[(\theta-T_n)^2])^{1/2}\to 0
$$

Since $|E[T_n]-\theta|\le E[|\theta-T_n|]$, it seems that $T_n$ is asymptotically unbiased?

Best Answer

My answer is in terms of leading questions.

You're trying to establish a fact about what happens with $T_n$ in the limit (as $n\to\infty$). You stopped looking at some finite $n$.

You have demonstrated (more or less) that the gap $|T_n-\theta|$ is small at a specific large $n$ and that it tends to decrease as you approach that $n$.

Did your simulation demonstrate what happens at some larger $n$? Or are you just guessing by extrapolating by eye?

Isn't there a gap between what the simulation suggests and what it actually establishes? To cover that gap, you will end up falling back on some further mathematical argument, will you not? (i.e. "this convergence-suggesting pattern should continue like that at larger $n$ because ...").

Imagine your estimator is a black box, and you only have the simulation to go by.

What makes you sure that the wiggly curve continues to shrink toward $\theta$?

Why could it not flatten out some small distance from $\theta$ and approach no closer? Why could it not at some later stage increase instead? Has a result at $n=500$ ruled these possibilities out for $n>10^{10^{10}}$?

Ultimately, to decide whether you have satisfied 'consistency', we must look at the definition of consistency -- can a simulation satisfy that definition?

You would need some additional argument (and you may well be able to come up with one) that convinces you that it will indeed continue like you see at larger $n$ -- if such an argument is going to be convincing to a skeptic, it will have elements similar to what is needed for the mathematical argument itself.


That's not to suggest that simulations like this are useless, but there's a difference between a clearly suggestive simulation and a proof of consistency. Spanning that gap requires some form of argument.


Responding to a comment:

"I just want to do a simulation to verify the proof result."

Such a simulation can be useful because, if the estimator were inconsistent, there might well be a suggestion of that in the results, enough to make you doubt the claim. But not seeing such a suggestion doesn't imply that one wouldn't become clear at some much larger $n$.

[Ultimately, consistency may matter less than the properties at sample sizes you might actually observe, but that's not the question we're dealing with here. So let us proceed as if we really do care about asymptotic properties.]

Let's consider some inconsistent estimators. Imagine an estimator that (a) had a bias that didn't shrink toward $0$ as $n$ grew very large, or (b) had some limiting non-zero variance.

For (a), consider, say, estimating a mean, with $T_n=\bar{X}+\delta$ for some tiny $\delta$ (this is obviously biased, but we're pretending we have a black-box estimator; imagine it's complicated enough that we can't just see it). At some sample size the variability will have shrunk enough that you'll start to "see" that $|\delta|$ is not zero, so if you're lucky enough to look out that far (and you simulate enough there), you'll have a clear suggestion that this is happening (the estimator will appear to home in on a value that's not $\theta$).
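A minimal sketch of what that might look like in a simulation (the offset $\delta=0.05$, the sample sizes, and the replication count below are assumptions chosen just for illustration):

```python
import numpy as np

# Sketch for case (a): a "black box" estimator that is secretly the sample mean
# plus a fixed offset delta (illustrative assumptions: delta = 0.05, theta = 0).
# The Monte Carlo variability shrinks with n, but the MSE levels off near delta**2
# instead of continuing toward 0, and the mean of T_n settles away from theta.
rng = np.random.default_rng(1)
theta, delta, N = 0.0, 0.05, 10_000

for n in (100, 1_000, 10_000):
    est = np.array([rng.normal(theta, 1.0, size=n).mean() + delta for _ in range(N)])
    print(f"n = {n:6d}   mean(T_n) = {est.mean():+.4f}   MC MSE = {np.mean((theta - est) ** 2):.5f}")
```

Here the mean of $T_n$ settles near $\theta+\delta$ and the MSE stops improving around $\delta^2$, which is exactly the kind of suggestion described above.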

For (b), consider, say, $T_n=\frac{6}{\pi^2}\left(X_1+\frac{X_2}{4}+\frac{X_3}{9}+\cdots+\frac{X_n}{n^2}\right)$, which approaches a weighted average of the data (with more weight on the earliest points). At some point - if you simulate out far enough* - you'll start to see that the variance of $T_n$ is not clearly shrinking below some value, at which point you might have cause to hold some doubt as to its consistency.
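A sketch of how this might show up (again, the sample sizes and replication count are illustrative assumptions):

```python
import numpy as np

# Sketch for case (b): T_n is a weighted sum with weights (6/pi^2) / i^2,
# so almost all the weight sits on the first few observations and
# Var(T_n) settles around a nonzero constant instead of shrinking to 0.
rng = np.random.default_rng(2)
theta, N = 0.0, 10_000

for n in (10, 100, 1_000):
    weights = (6 / np.pi**2) / np.arange(1, n + 1) ** 2   # 6/(pi^2 * i^2), i = 1..n
    est = rng.normal(theta, 1.0, size=(N, n)) @ weights   # N replications of T_n
    print(f"n = {n:5d}   Var(T_n) estimate = {est.var():.4f}")
```

For this particular estimator the variance settles near $\left(\tfrac{6}{\pi^2}\right)^2\sum_{i\ge 1} i^{-4} = \tfrac{36}{90} = 0.4$ rather than shrinking toward $0$.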

If you do see effects like these, the specific problems that appear can give clues about where it might be productive to focus your attention when looking for an error in the proof.

(You might also like to consider what it might look like when both sorts of effects are present.)


* This particular one doesn't need a large $n$ to see it, but I could make a similar sort of estimator that requires a very large $n$, because the weights shrink slowly enough that you can't clearly see the problem at small $n$.