[Math] Understanding the rationale behind “batch means” estimation

ho.history-overview na.numerical-analysis st.statistics

Hello all,

I am implementing an MCMC algorithm for my work, and I've come upon something in the literature which I just can't understand.

Specifically, I am attempting to estimate the amount of variance in my Monte Carlo estimates for the mean of my function $g$. The first and simplest approach I've seen mentioned repeatedly is the method of batch means estimation. For completeness, I will include a short description here.

As I've read it, the batch means method involves running my Markov chain $\{X_{n}\}$ for some large number of iterations, say $N = ab$, then breaking this long run into $a$ batches of $b$ samples each. We then evaluate the mean estimate of each batch by computing
$$ Y_{k} = \frac{\sum_{i = (k-1)b + 1}^{kb}g(X_{i})}{b},$$
where $g$ is the function I am computing the expected value of. These average estimates are themselves averaged to yield an overall estimate of our expected value,
$$\hat{\mu} = \frac{\sum_{i = 1}^{a} Y_{i}}{a}.$$

So far, I can understand where these computations come from. It's this next part I don't quite get. I want to present some estimate of the Monte Carlo Standard Error (MCSE) with the results I compute, and so I am trying to compute the variance of these estimates. The equation I've seen several times without any explanation is:

$$\hat{\sigma}^{2} = \frac{b}{a-1}\sum_{k=1}^{a}(Y_{k}-\hat{\mu})^{2}.$$
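
As a concrete reading of the formulas so far, here is a minimal Python sketch of the computation. The function name `batch_means`, the use of NumPy, and the choice to drop any trailing samples beyond $ab$ are assumptions made for illustration; the last line uses the common convention that the MCSE is $\sqrt{\hat{\sigma}^{2}/N}$ with $N = ab$.

```python
import numpy as np

def batch_means(g_values, a):
    """Batch means estimates from a 1-D sequence of g(X_i) values.

    Hypothetical helper for illustration. Returns (mu_hat, sigma2_hat, mcse).
    """
    g_values = np.asarray(g_values, dtype=float)
    b = g_values.size // a              # batch size; samples beyond a*b are dropped
    if a < 2 or b < 1:
        raise ValueError("need at least 2 batches and 1 sample per batch")
    batches = g_values[: a * b].reshape(a, b)
    Y = batches.mean(axis=1)            # batch means Y_k
    mu_hat = Y.mean()                   # average of the batch means
    sigma2_hat = b * Y.var(ddof=1)      # b/(a-1) * sum_k (Y_k - mu_hat)^2
    mcse = np.sqrt(sigma2_hat / (a * b))
    return mu_hat, sigma2_hat, mcse
```

For example, `batch_means(g_samples, a=30)` would split a hypothetical array `g_samples` of $g(X_{i})$ values into 30 batches.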

This seems to be very close to the sample variance, as I would compute it:

$$s^{2} = \frac{1}{n-1} \sum_{i = 1}^{n}(x_{i} - \bar{x})^{2},$$
except that there is an extra factor of $b$. Can anyone help me understand why we multiply by the number of samples per batch here?

Thanks for your time and consideration.

Best Answer

The variance of the mean of $n$ independent, identically distributed random variables is

$\mbox{Var}(\bar{x})=\mbox{Var}(\sum_{i=1}^{n} x_{i}/n)$

$\mbox{Var}(\bar{x})=\sum_{i=1}^{n} (1/n)^{2} \mbox{Var}(x_{i})$

$\mbox{Var}(\bar{x})=n(1/n)^{2} \mbox{Var}(x_{i})$

$\mbox{Var}(\bar{x})=(1/n) \mbox{Var}(x_{i})$

Thus

$\mbox{Var}(x_{i})=n\mbox{Var}(\bar{x})$
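
To connect this identity back to the batch means formula in the question, the step can be sketched as follows. This is a heuristic under assumptions not stated above: $\sigma^{2}$ denotes the variance parameter being estimated on the per-sample scale (for a correlated chain, the asymptotic variance in the Markov chain central limit theorem rather than $\mbox{Var}(g(X_{i}))$), and $b$ is taken large enough that the batch means $Y_{k}$ are approximately uncorrelated with $\mbox{Var}(Y_{k}) \approx \sigma^{2}/b$.

$$\mbox{Var}(Y_{k}) \approx \frac{\sigma^{2}}{b} \quad\Longrightarrow\quad \sigma^{2} \approx b\,\mbox{Var}(Y_{k}) \approx \frac{b}{a-1}\sum_{k=1}^{a}(Y_{k}-\hat{\mu})^{2} = \hat{\sigma}^{2}.$$

So the factor of $b$ converts the sample variance of the batch means into an estimate on the per-sample scale, and the squared Monte Carlo standard error of $\hat{\mu}$ is then roughly $\hat{\sigma}^{2}/N$ with $N = ab$.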
