Variance for hit-and-miss Monte Carlo is given by $\operatorname{Var}(\hat\theta)=\frac{\theta(1-\theta)}{N}$, where $\hat\theta$ is the estimated probability of a hit and $N$ is the number of simulations. Can someone explain why? And what is the variance when importance sampling is used?
Solved – Variance for hit-and-miss Monte Carlo method and importance sampling
importance-sampling, monte-carlo
Related Solutions
Importance sampling has exactly the same justification as the basic Monte Carlo approach. At its core, it is basic Monte Carlo. Indeed, it is simply a change of reference measure, going from $$ \int h(x) f(x) \,\text{d}x $$ to $$ \int h(x) \dfrac{f(x)}{g(x)}\, g(x) \,\text{d}x $$ Thus convergence is guaranteed by the law of large numbers in both cases, i.e., whether you simulate from $f$ or from $g$. In addition, if the term $$ \int h^2(x) \dfrac{f^2(x)}{g(x)} \,\text{d}x $$ is finite, the central limit theorem also applies and the speed of convergence is $\text{O}(1/\sqrt{n})$. If it "takes so long in practice", it is because the above variance factor in the CLT can be quite large. But, and I insist, the speed is the same as with regular Monte Carlo: $\text{O}(1/\sqrt{n})$.
The quality of an importance sampling distribution is thus directly related to the above variance factor, which goes to zero for the "zero-variance distribution" proportional to $|h(x)|f(x)$.
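To make the change of measure concrete, here is a minimal sketch (in Python rather than the R used in the answer below; the target $P(X>3)$ for standard normal $X$ and the proposal $g = N(3,1)$ are my choices for illustration). Both estimators are unbiased; the importance-sampling one has far smaller variance because $g$ puts its mass where the rare event happens:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def phi(x):
    # standard normal density f
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

# Target: theta = P(X > 3) = E_f[h(X)] with h(x) = 1{x > 3}

# Plain Monte Carlo: sample directly from f
x = rng.standard_normal(n)
plain = np.mean(x > 3)

# Importance sampling: sample from g = N(3, 1), weight each draw by f/g
y = rng.normal(3.0, 1.0, n)
w = phi(y) / phi(y - 3.0)          # f(y) / g(y), since g is N(3,1)
is_est = np.mean((y > 3) * w)

print(plain, is_est)  # exact value: 1 - Phi(3) ≈ 1.3499e-3
```

With $n = 10^5$, the plain estimator sees only about 135 hits, while nearly half the importance-sampling draws land above 3, so the weighted estimate is far more stable.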
First, they compare crude ($M$) and hit-or-miss ($H$) there; they don't mention 'improvements' until after that section, once the comparison is done.
The page apparently contains a typographical error.
The crude estimate ($M$) has a smaller variance, $\sigma^2_M$, than the hit-and-miss estimate's ($H$) variance, $\sigma^2_H$. That is, $\sigma^2_M<\sigma^2_H$.
As requested, I will illustrate this via a simple example. Consider using both methods on the following integration problem: estimate $\int_0^1 f(x)\, dx$ where $f(x)=x,\quad 0<x<1$ (the true value is $1/2$).
Here are the results of 1000 simulations, each based on 1000 points, for each kind of estimator:
As you can see, the crude estimator has the smaller variance, so the claim on the page that $\sigma^2_M-\sigma^2_H>0$ is a mistake.
I did the simulations in R:
x=matrix(runif(1000000),nr=1000)  # 1000 replications x 1000 uniforms each
y=matrix(runif(1000000),nr=1000)
crude=colMeans(x)                 # crude: mean of f(U) = U per replication
mean(crude)
[1] 0.5001032
sd(crude)
[1] 0.009110159
hitmiss=colSums(y<x)/1000         # hit-or-miss: fraction of points under f(x) = x
mean(hitmiss)
[1] 0.500518
sd(hitmiss)
[1] 0.01622797
We can also calculate the variance of the two algebraically.
The variance of the crude estimate for my example problem is simply $\frac{1}{n} \text{Var}(X)$ where $X$ is uniform on $(0,1)$; that is, it should be $1/(12n)$ for this problem.
The variance of the hit-or-miss estimate is $\theta(1-\theta)/n$, as in the question above. For our problem, $\theta=1/2$, so the variance should be $1/(4n)$.
The standard deviations for $n=1000$ should then be about 0.00913 for the crude and about 0.0158 for the hit-and-miss. The above simulations came out pretty close to those values (0.00911 vs 0.0162).
Note that the integral they give there, $\frac{1}{n}\int_0^1 f(u)(1-f(u))\,du$, works out to be $1/(6n)$ for our toy problem here... which happens to be the same as $1/(4n) - 1/(12n)$ (i.e., $\frac{3}{12n}-\frac{1}{12n}=\frac{2}{12n}$)... which is $\sigma^2_H-\sigma^2_M$ in our example. That's no accident, and, as I said at the start, it's apparently just a typo.
That is, they just got the subscripts backward; what they actually prove is that $\sigma^2_H-\sigma^2_M >0$. (Edit: their result isn't true in all circumstances; they must restrict the class of functions elsewhere.)
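The algebra above can be double-checked numerically. Here is a sketch paralleling the R simulation, written in Python (the variable names and replication count are my choices); it compares the empirical variances of both estimators to $1/(12n)$, $1/(4n)$, and their difference $1/(6n)$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000        # points per estimate
reps = 2000     # independent replications

x = rng.random((reps, n))
y = rng.random((reps, n))

crude = x.mean(axis=1)           # crude: mean of f(U) = U per replication
hitmiss = (y < x).mean(axis=1)   # hit-or-miss: fraction of points under f(x) = x

var_crude = crude.var()
var_hit = hitmiss.var()

print(var_crude, 1 / (12 * n))               # theory: 1/(12n) ≈ 8.33e-5
print(var_hit, 1 / (4 * n))                  # theory: 1/(4n)  = 2.5e-4
print(var_hit - var_crude, 1 / (6 * n))      # theory: 1/(6n)  ≈ 1.67e-4
```

Each printed pair should agree to within simulation error, confirming that the difference between the two variances is exactly the $\frac{1}{n}\int_0^1 f(u)(1-f(u))\,du$ term.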
Best Answer
The first formula is just the variance of the sample proportion $\hat{p} = X/n$ for a binomial.
If $X$ is the number of "hits" in $n$ trials,
$$\newcommand{\Var}{\operatorname{Var}}\Var(X) = n\theta(1-\theta)$$
(Wikipedia on the Binomial Distribution).
So
$$\Var(X/n) = \frac{1}{n^2} n\theta(1-\theta) = \frac{\theta(1-\theta)}{n}.$$
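As a quick numerical sanity check of that identity (a Python sketch; the particular $\theta$, $n$, and replication count are my choices):

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 0.3, 50, 200_000

# X ~ Binomial(n, theta); the estimator is p_hat = X / n
x = rng.binomial(n, theta, reps)
p_hat = x / n

print(p_hat.var(), theta * (1 - theta) / n)  # both ≈ 0.0042
```

The empirical variance of $\hat p$ across replications matches $\theta(1-\theta)/n = 0.3 \cdot 0.7 / 50 = 0.0042$.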
For a comparison of the variance between the plain binomial case and importance sampling, see this section of the Wikipedia article on importance sampling.