This is a Monte Carlo simulation, though I doubt it is doing what you want, and it is of little practical use as written. What you are comparing is 10,000 single-sample studies, determining in what proportion of them one observed value happens to be higher than the other. So, it's probably better to conceptualize your code as the following, less efficient, code:
runs <- 10000
result <- logical(runs)
for (i in 1:runs) {
  a <- rnorm(1, mean = 0)   # a single draw per "group"
  b <- rnorm(1, mean = 0)
  result[i] <- a > b
}
mc.p.value <- mean(result)  # proportion of runs in which a exceeds b
The above code, when the distributions are equal, should show that $a$ is higher than $b$ 50% of the time, but this type of result is essentially useless in practice because you would only have $N=2$, and statistical inferences are not applicable (i.e., there is no variance within each group available to quantify sampling uncertainty).
What you are missing is a comparison of two summary statistics of interest to determine their sampling properties, which is the usual subject of statistical inference and requires at least a minimal number of data points per group to quantify some form of sampling uncertainty.
As it stands, this is a standard independent-samples $t$-test set-up. So, you could compare a number of things, such as how often the first group mean is higher than the second, how often a $t$-test gives a $p < \alpha$ result (or, equivalently, how often the $|t|$ ratio exceeds the cut-off), and so on.
E.g., if $n=20$ for each group and the distributions are equal in the population, then:
runs <- 10000
n <- 20
alpha <- .05
result.p <- result.mean <- numeric(runs)
for (i in 1:runs) {
  a <- rnorm(n, mean = 0)
  b <- rnorm(n, mean = 0)
  result.mean[i] <- mean(a) > mean(b)
  result.p[i] <- t.test(a, b)$p.value
}
mc.p.value <- mean(result.p < alpha)   # empirical Type I error rate
mc.a_gt_b.value <- mean(result.mean)   # how often the first group mean is higher
Playing around with the other parameters, such as setting the first group mean to 1, will change the nature of the simulation (from a Type I error simulation, as it is now, to a power study).
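For instance, here is a minimal sketch of the power version: everything as above, but with the first group's population mean set to 1 (an arbitrary effect size chosen purely for illustration).
runs  <- 10000
n     <- 20
alpha <- .05
result.p <- numeric(runs)
for (i in 1:runs) {
  a <- rnorm(n, mean = 1)   # first group mean shifted away from 0
  b <- rnorm(n, mean = 0)
  result.p[i] <- t.test(a, b)$p.value
}
mean(result.p < alpha)      # now an estimate of power rather than of the Type I error rate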
I'm not sure this will work 100%, but to assess whether your estimator has the convergence rate $O_p(N^{-\frac{1}{3}})$, you can start from the definition: for any given $\epsilon > 0$,
$$\lim_{n \rightarrow \infty} Pr(|T_n - \theta| > \epsilon) = 0 $$
So, to evaluate this visually, I would suggest fixing some small $\epsilon$ and, for each $n$, simulating $|T_n - \theta|$ many times and computing the empirical $Pr(|T_n - \theta| > \epsilon)$, then plotting it against $n$. With this you can see how the convergence behaves relative to $N^{-\frac{1}{3}}$. By the way, this doesn't prove anything; you still need the theoretical calculations to demonstrate any such statement.
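As a rough illustration of that recipe (not your actual estimator), here is a minimal sketch; it uses the sample median of an Exponential(1) sample purely as a stand-in for $T_n$, with $\theta = \log 2$ as its true value, so you would substitute your own estimator and target.
set.seed(1)
eps   <- 0.05
reps  <- 2000
ns    <- c(50, 100, 200, 500, 1000, 2000)
theta <- log(2)   # true median of an Exponential(1), the stand-in target

p.hat <- sapply(ns, function(n) {
  dev <- replicate(reps, abs(median(rexp(n)) - theta))
  mean(dev > eps)                     # empirical Pr(|T_n - theta| > eps)
})

plot(ns, p.hat, type = "b", xlab = "n", ylab = "empirical P(|T_n - theta| > eps)")
lines(ns, ns^(-1/3), lty = 2)         # reference curve proportional to N^(-1/3)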
Best Answer
From your expression above for $m_n$, you can derive the following:
$$P[X^n > m_n] = 1 - \Phi(n/2)$$
For $n=10$, that's less than 0.3 per million, so the vast majority of samples will be smaller than $m_n$ (and it is not unusual for not a single sample out of 1 million to exceed $m_n$). It is the rare but very large values that pull up the moment, as you mention, and this becomes more pronounced for higher moments.
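As a quick numerical check of that figure:
n <- 10
1 - pnorm(n / 2)   # approx. 2.9e-07, i.e. roughly 0.3 exceedances per million samples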
One relatively simple way to improve a Monte Carlo estimator that depends critically on rare events is importance sampling: "tweak" the distribution to make those events more likely, and then weight the samples to get back to the distribution we are actually trying to sample from. One such approach is called exponential twisting (or tilting). For example, for a normal distribution $Y\sim \mathcal{N}(\mu,\sigma^2)$, consider the twisted random variable:
$$Y_\theta \sim \mathcal{N}(\mu + \theta\sigma^2,\sigma^2)$$
For some function $f$ of interest, you can show that:
$$E(f(Y)) = E_{\theta}\left[f(Y)p(Y)/p_{\theta}(Y)\right]=E_{\theta}\left[f(Y)e^{(\mu-Y)\theta+\sigma^2\theta^2/2}\right]$$
The expectation $E$ is taken with respect to the distribution of $Y$, while $E_{\theta}$ is with respect to that of $Y_{\theta}$. The parameter $\theta$ can be tuned depending on the problem at hand (e.g. which events we need to make more likely), and $\theta = 0$ corresponds to the regular MC method.
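The last equality in the identity above is just the ratio of the two normal densities:
$$\frac{p(y)}{p_{\theta}(y)} = \exp\left(\frac{(y-\mu-\theta\sigma^2)^2-(y-\mu)^2}{2\sigma^2}\right) = \exp\left((\mu-y)\theta+\frac{\sigma^2\theta^2}{2}\right)$$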
The above identity means that you can estimate the expectation of interest on the left by Monte Carlo: draw $Y_1,\dots,Y_n$ from the tilted distribution $\mathcal{N}(\mu+\theta\sigma^2,\sigma^2)$ and average the weighted samples,
$$\widehat{E(f(Y))} = \frac{1}{n}\sum_{i=1}^{n} f(Y_i)\, e^{(\mu-Y_i)\theta+\sigma^2\theta^2/2}.$$
For your specific problem, you would take $\mu = 10$, $\sigma^2 = 100$ and $f(x) = e^x$.
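Here is a minimal sketch of that estimator in code; the sample size, the grid of $\theta$ values, and the object names are just illustrative choices of mine:
set.seed(1)
mu <- 10; sigma2 <- 100; sigma <- sqrt(sigma2)
n.samp <- 1000
exact <- exp(mu + sigma2 / 2)   # closed-form E[e^Y] for Y ~ N(mu, sigma2)

is.estimate <- function(theta) {
  y <- rnorm(n.samp, mean = mu + theta * sigma2, sd = sigma)  # draws from the tilted density
  w <- exp((mu - y) * theta + sigma2 * theta^2 / 2)           # weights p(y) / p_theta(y)
  mean(exp(y) * w)                                            # importance-sampling estimate of E[e^Y]
}

thetas <- seq(0, 1.5, by = 0.25)
sapply(thetas, is.estimate) / exact   # ratios near 1 are accurate; theta = 0 is plain MC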
For a range of values of $\theta$, sampling only 1000 $Y_i$'s for each value, I obtain the following estimates (the horizontal line is the exact value):
We see that the regular MC method ($\theta=0$) performs poorly, and that values near $\theta=1$ produce a result much closer to the true value.
Obviously, this is a somewhat contrived example: if I simply take $\theta = 1$, then every weighted sample $f(Y_i)\,p(Y_i)/p_{\theta}(Y_i)$ equals $e^{\mu+\sigma^2/2}$, and I get the exact result with no sampling variation, which we already knew from the beginning. The method can still be used in other situations where a closed-form solution or other efficient numerical approximation does not exist.