Random Number Generation – Draw Random Numbers from Finite Mixture Model

Tags: finite-mixture-model, random-generation

Setup

Let $X$ follow a finite mixture model with density
\begin{equation}
f=\lambda f_1+(1-\lambda) f_2
\end{equation}

where $f_1$ and $f_2$ are log-normal densities with parameters $(\mu_1, \sigma_1)$ and $(\mu_2, \sigma_2)$ respectively, and $\lambda \in [0,1]$ is the mixing probability. Let $Z_i$ be a random variable with density $f_i$, $i=1,2$.

I want to generate a set of $N$ realisations from the random variable $X$.

In practice I saw two ways to do this:

Method 1

For each draw: choose with probability $\lambda$ a realisation from $Z_1$ and with probability $(1-\lambda)$ a realisation from $Z_2$.
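In NumPy, Method 1 might be sketched as follows (the log-normal parameters and $\lambda$ used here are illustrative, not taken from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mixture(n, lam, mu1, sigma1, mu2, sigma2, rng):
    """Method 1: for each draw, pick component 1 with probability lam,
    otherwise component 2, and return that component's realisation."""
    from_first = rng.random(n) < lam          # Bernoulli(lam) indicator per draw
    z1 = rng.lognormal(mu1, sigma1, size=n)   # candidate draws from Z1
    z2 = rng.lognormal(mu2, sigma2, size=n)   # candidate draws from Z2
    return np.where(from_first, z1, z2)

x = sample_mixture(100_000, 0.3, 0.0, 0.5, 1.5, 0.5, rng)
```

Every element of `x` is a genuine realisation of one of the two components, so the sample stays inside the union of the two supports.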

Method 2

Draw $N$ realisations from $Z_1$ and $N$ realisations from $Z_2$, to obtain the two vectors $\mathbf{z}_1=(z_{11},z_{21},\dots,z_{N1})^T$ and $\mathbf{z}_2=(z_{12},z_{22},\dots,z_{N2})^T$.

Then obtain the desired vector of $N$ realisations as

\begin{equation}
\mathbf{x}=\lambda \mathbf{z}_1+(1-\lambda) \mathbf{z}_2
\end{equation}
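As a concrete sketch (again with illustrative parameters), Method 2 amounts to an element-wise weighted sum of the two sample vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
N, lam = 100_000, 0.3

z1 = rng.lognormal(0.0, 0.5, size=N)  # N draws from Z1 (illustrative parameters)
z2 = rng.lognormal(1.5, 0.5, size=N)  # N draws from Z2

x = lam * z1 + (1 - lam) * z2         # element-wise weighted sum per Method 2
```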

Question

I find Method 1 intuitive, but Method 2 seems strange. If we restricted the supports of $f_1$ and $f_2$, allowing, for example, only integer values, then the draws of $X$ under Method 2 could contain values that are not in the support of either $Z_1$ or $Z_2$. That said, I ran some simulations, and the histograms I plotted looked identical. Hence the question:

Are Method 1 and Method 2 asymptotically equivalent? Are they equivalent in small samples as well? Is one of the two methods simply wrong?

Thanks a lot for your help!

Best Answer

The second method is not only strange, it is wrong; the first method is correct. By definition, a mixture distribution applies the weights $\lambda$ and $1-\lambda$ to the distribution functions, not to the random variables.* To convince yourself, try both methods. You can find such an example below, using a mixture of Gaussians, with the probability density function shown in red. In the first case the histogram matches the probability density function. In the second case the histogram of the samples is unimodal while the target distribution is bimodal, so it is clearly wrong. In fact, we know that a sum of independent Gaussians is Gaussian, and that is exactly what the second plot shows.

Two histograms described above.
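A minimal sketch of the Gaussian experiment described above (the component means, standard deviation, and $\lambda$ are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(42)
N, lam = 100_000, 0.5
mu1, mu2, sigma = -3.0, 3.0, 1.0

# Method 1: component membership drawn per observation -> bimodal sample
from_first = rng.random(N) < lam
x1 = np.where(from_first,
              rng.normal(mu1, sigma, N),
              rng.normal(mu2, sigma, N))

# Method 2: weighted sum of the two sample vectors -> unimodal sample,
# since a linear combination of independent Gaussians is Gaussian
x2 = lam * rng.normal(mu1, sigma, N) + (1 - lam) * rng.normal(mu2, sigma, N)
```

Plotting histograms of `x1` and `x2` reproduces the two panels: `x1` shows two well-separated modes near $\mu_1$ and $\mu_2$, while `x2` piles up in a single mode near $\lambda\mu_1 + (1-\lambda)\mu_2 = 0$.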

If you want to simplify the sampling, note that the number of samples $n_1$ drawn from $Z_1$ follows a binomial distribution with success probability $\lambda$ and sample size $N$. So you can draw $n_1$ from the binomial distribution, then take $n_1$ samples from $Z_1$ and $N-n_1$ samples from $Z_2$; together they form a sample of size $N$ from the mixture.
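This binomial shortcut might be sketched like so (parameters again illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N, lam = 100_000, 0.3

# How many of the N draws come from the first component: n1 ~ Binomial(N, lam)
n1 = rng.binomial(N, lam)

x = np.concatenate([
    rng.lognormal(0.0, 0.5, size=n1),      # n1 draws from Z1 (illustrative params)
    rng.lognormal(1.5, 0.5, size=N - n1),  # N - n1 draws from Z2
])
rng.shuffle(x)  # shuffle in case the order of the sample matters downstream
```

This avoids generating the unused candidate draws that the per-observation version discards.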

* Think of a trivial example: multiply a Bernoulli($p$) random variable by $2$. If you multiply the values, you get a random variable taking the values $0$ and $2$ with probabilities $1-p$ and $p$. If instead you multiplied the probability mass function by $2$, the values $0$ and $1$ would get "probabilities" $2(1-p)$ and $2p$, which sum to $2$ and can exceed $1$, so the result is not a valid distribution. Scaling the distribution and scaling the random variable are different things.
