Solved – Importance Sampling to evaluate integral in R

importance-samplingmonte carlonumerical integrationrsampling

I have asked the question here also.
However, there might be something wrong with my theoretical understanding hence I'm asking here as it is more relevant. Kindly do not diss without looking first.

I have referred to these links –
here and here. The latter link is a question of mine, but the answers, as they turn out, aren't quite right.

Hence I am creating asking a question with new attempts made to answer it.

  • I have the following integral
    $$\int_{0.01}^1 x^{-0.5}\,\text{d}x$$
    To solve this using Importance Sampling MC integration, one needs to select an importance pdf that is approximately the same as the function plot

My R code to solve the same is this :

#function 1 - importance sampling
w <- function(x) dunif(x,0.01,1)/dbeta(x,0.7,1)
f <- function(x) x^(-0.5)
X <- rbeta(1000,0.7,1)
Y <- w(X)*f(X)
c(mean(Y),var(Y))

True integral value – 1.8

Using the Importance Sampling code above – 1.82 (where my importance PDF is Beta(0.7,1)

which is quite alright so I'm assuming the code is correct.

  • Now I have this integral
    $$\int_{0.3}^8 [1+\sinh(2x)\log(x)]^{-1}\text{d}x$$

for which my code is :

#function 2 
w <- function(x) dunif(x,0.01,1)/dnorm(x,0.5,0.25)
f <- function(x) (1+sinh(2*x)*log(x))^(-1)
X <- rnorm(1000,0.5,0.25)
Y <- w(X)*f(X)
Y <- Y[!is.na(Y)]
c(mean(Y),var(Y))

True Integral Value ~0.7014

Value from executing above code ~3.25 (where my importance PDF is normal(0.5,sd=0.25)


What am I doing wrong?

1) Take function to be evaluated as f(x).

2) Generate samples from Importance PDF g(x) that is truncated between the intervals.

3) Get the mean(f(x)/g(x)) which is the integral.

Best Answer

When running the code provided for the second function I get

> c(mean(Y),var(Y))
[1] 3.2981238 0.5203621
> integrate(f,0.01,1)
3.19264 with absolute error < 1.1e-06

which means that the true value of the integral is close to 3.2, not to 0.70.

If you want to integrate f from 0.3 to 8, then the importance function must be

> w <- function(x) dunif(x,0.3,8)/dnorm(x,0.5,0.25)

and I changed the handling of the NA's in the problem by

> f <- function(x) (x>0)*(1+sinh(2*x)*log(abs(x)))^(-1)

which leads to

> Y <- w(X)*f(X)
> integrate(f,0.3,8)
2.77512 with absolute error < 2.4e-05
> integrate(f,0.3,8)$val/(8-.3)
[1] 0.3604052

which shows a good fit even though the Normal importance function N(0.5,0.5) may be too concentrated, i.e., it has too small a variance to cover (0.3,8) with some reasonable probability. If one changes the scale of the Normal,

> X <- rnorm(1e5,0.5,2.25)
> Y <- w(X)*f(X)
> c(mean(Y),var(Y))
[1] 0.3664046 1.0230197
> integrate(f,0.3,8)$va/(8-.3)
[1] 0.3604052

which shows a fairly good fit. When using a smaller standard deviation like $\sigma=0.1$, I find a poorer fit

> c(mean(Y),var(Y))
[1]   0.3370849 484.6246147

even when compared with $\sigma=1$:

> c(mean(Y),var(Y))
[1] 0.3606815 0.3232931

To make sense of those variations, I ran the experiment for a range of values of $\sigma$, from $0.1$ to $4$, with $10^6$ normal simulations, and got the following graph (with a log scale on the first axis)

enter image description here

which shows the improvement brought by large enough $\sigma$'s. And an optimum around $\sigma=0.74$.

Here is the relevant R code

w <- function(x) dunif(x,0.3,8)/dnorm(x,0.5,sigma) 
sigs=varz=meanz=seq(.1,4,le=50)
for (i in 1:50){
sigma=sigs[i];X=0.5+sigma*X0;Y=w(X)*f(X)
varz[i]=var(Y);meanz[i]=mean(Y)}
Related Question