I trying to experiment with law of total variance in order to empirically recreate theoretical results.
In particular I am interested in verifying that:
$$
Var(Y) = E[Var(Y|X)] + Var(E[Y|X])
$$
Let's assume I have the following random variables:
$$
X \sim Gamma(\alpha, \beta)
$$
$$
Y \sim Poisson(X)
$$
The total variance of Y should be equal to:
$$
Var(Y) = E[Var(Y|X)] + Var(E[Y|X]) = E[X] + Var(X) = \alpha*\beta + \alpha*\beta^2
$$
This follow from $E[X] = \alpha*\beta$, $Var(X) = \alpha*\beta^2$, $E[Y|X] = Var(Y|X) = X$, which are known results for the Gamma and Poisson distribution.
I tried to recreate such results in R with the following code:
alpha <- 250
beta <- 10
n <- 10000
X <- rgamma(n, shape = alpha, scale = beta)
Y <- replicate(n, rpois(1, sample(X, 1)))
alpha * beta + alpha * beta^2
[1] 27500
mean(X) + var(X)
[1] 27552.47
var(Y)
[1] 26610.7
In this situation the total variance is 27500
and when I calculate it manually with the previous results it is, in fact, 27500
(or close enough).
My question is, why when I do var(Y)
I get a wrong result? Am I defining the variable Y
in the right way?
Thank you
Best Answer
Your estimator is quite close to the true variance, but it appears you might need a higher sample size to get as close as you want. (You also forgot to set the seed to get a reproducible analysis.) I'm going to try it with a larger number of samples: