Solved – Computing Gaussian mixture model probabilities

mixture-distribution, normal distribution, probability

I was looking over the solution to this question on SO and it got me thinking about computing probabilities for a Gaussian mixture model.

Let's assume you've fit some Gaussian mixture model so that it results in a mixture of three normals:

\begin{equation}
X_1 \sim \mathcal{N}(\mu_1,\sigma_{1}^{2}), \quad X_2 \sim \mathcal{N}(\mu_2,\sigma_{2}^{2}), \quad X_3 \sim \mathcal{N}(\mu_3,\sigma_3^2)
\end{equation}
with respective weights $\lambda_1, \lambda_2$, and $\lambda_3$. From here on out, take $\mathbf{X}=[X_1,X_2,X_3]$ and $\mathbf{\lambda}=[\lambda_1,\lambda_2,\lambda_3]$.

Typically, to find the probability that a draw from this model is at most some value $x$, we compute
\begin{equation}
\mathbf{P}(\mathbf{\lambda}\mathbf{X}^T\leq x) = \sum_{i=1}^{3} \lambda_i \mathbf{P}(X_i\leq x)
\end{equation}

It's possible that I've made a coding error, but the probability obtained from the formula above seems to differ from the one I get when I compute it another way:
\begin{equation}
\mathbf{P}(\mathbf{\lambda}\mathbf{X}^T\leq x)=\mathbf{P}(Y\leq x)
\end{equation}
where $Y\sim\mathcal{N}(\mathbf{\lambda}\mathbf{\mu}^T,\sum_{i=1}^{3} \lambda_{i}^{2}\sigma_{i}^{2})$ with $\mathbf{\mu}=[\mu_1,\mu_2,\mu_3]$.

If it's a coding error, please just leave a comment and I will delete this question.

Best Answer

Yes, the two probabilities ought to be different, because one is for a mixture and the other is for a sum. Look at an example:

[Figure: mixture and sum PDFs]

The thick red curve is the probability density function for a mixture of three normals ($X$). The dashed curves are its components (each scaled by $\lambda_i$); they are normal. The thick blue curve is the pdf of the normal distribution with the weighted mean and weighted variance that define $Y$; it, too, is normal. In particular, note that the possibility of the mixture having multiple modes (three in this case, between one and three in general) makes it perfectly clear the mixture is not normal in general, because normal distributions are unimodal.
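
To make the distinction concrete, both distribution functions can be written in closed form. Here $\Phi$ denotes the standard normal CDF, a symbol introduced for this note rather than taken from the question, and the weights $\lambda_i$ are assumed to sum to one:
\begin{equation}
\mathbf{P}(X\leq x) = \sum_{i=1}^{3} \lambda_i\, \Phi\!\left(\frac{x-\mu_i}{\sigma_i}\right), \qquad \mathbf{P}(Y\leq x) = \Phi\!\left(\frac{x-\sum_{i=1}^{3}\lambda_i\mu_i}{\sqrt{\sum_{i=1}^{3}\lambda_i^{2}\sigma_i^{2}}}\right)
\end{equation}
Both distributions share the mean $\sum_{i=1}^{3}\lambda_i\mu_i$, but their variances differ:
\begin{equation}
\operatorname{Var}(X) = \sum_{i=1}^{3}\lambda_i\left(\sigma_i^{2}+\mu_i^{2}\right) - \left(\sum_{i=1}^{3}\lambda_i\mu_i\right)^{2}, \qquad \operatorname{Var}(Y) = \sum_{i=1}^{3}\lambda_i^{2}\sigma_i^{2}
\end{equation}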

The mixture can be modeled as a two-step process: first draw one of the three ordered pairs $(\mu_1, \sigma_1)$, $(\mu_2, \sigma_2)$, and $(\mu_3, \sigma_3)$ with probabilities $\lambda_1$, $\lambda_2$, and $\lambda_3$, respectively. Then draw a value $X$ from the normal distribution specified by the parameters you drew (understood as mean and standard deviation).
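
The two-step description can be checked by simulation. The following is a minimal sketch using only the Python standard library; the weights, means, and standard deviations are made-up illustrative values, not parameters from the question, and the function names are hypothetical.

```python
import random
from statistics import NormalDist

# Illustrative parameters (assumed for this sketch, not from the question).
weights = [0.5, 0.3, 0.2]   # lambda_i, summing to 1
means = [-2.0, 0.0, 3.0]    # mu_i
sds = [0.5, 1.0, 0.8]       # sigma_i


def sample_mixture(rng):
    """Two-step draw: pick a component by weight, then sample from it."""
    i = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[i], sds[i])


def mixture_cdf(x):
    """Weighted sum of the component normal CDFs."""
    return sum(w * NormalDist(m, s).cdf(x)
               for w, m, s in zip(weights, means, sds))


rng = random.Random(0)      # fixed seed so the run is repeatable
n = 200_000
x = 0.5
draws = [sample_mixture(rng) for _ in range(n)]
empirical = sum(d <= x for d in draws) / n

print(f"empirical P(X <= {x}) = {empirical:.4f}")
print(f"formula   P(X <= {x}) = {mixture_cdf(x):.4f}")
```

Up to Monte Carlo error, the empirical fraction should agree with the weighted sum of component CDFs, which is the first formula in the question.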

The weighted sum is obtained from a completely different procedure: independently draw a value $Y_1$ from a normal distribution with parameters $(\lambda_1 \mu_1, \lambda_1 \sigma_1)$, a value $Y_2$ from a normal distribution with parameters $(\lambda_2 \mu_2, \lambda_2 \sigma_2)$, and a value $Y_3$ from a normal distribution with parameters $(\lambda_3 \mu_3, \lambda_3 \sigma_3)$, again understood as mean and standard deviation (equivalently, $Y_i = \lambda_i X_i$ for independent $X_i$ as defined in the question). Then form their sum $Y = Y_1+Y_2+Y_3$.
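
The sum procedure can be simulated in the same way and compared against the mixture formula. This is a rough sketch with made-up illustrative parameters (not taken from the question) and hypothetical function names; it uses only the Python standard library.

```python
import math
import random
from statistics import NormalDist

# Illustrative parameters (assumed for this sketch, not from the question).
weights = [0.5, 0.3, 0.2]   # lambda_i
means = [-2.0, 0.0, 3.0]    # mu_i
sds = [0.5, 1.0, 0.8]       # sigma_i


def sample_weighted_sum(rng):
    """Draw each scaled component independently and add them up."""
    return sum(rng.gauss(w * m, w * s)
               for w, m, s in zip(weights, means, sds))


def mixture_cdf(x):
    """Weighted sum of the component normal CDFs (the mixture)."""
    return sum(w * NormalDist(m, s).cdf(x)
               for w, m, s in zip(weights, means, sds))


# The sum of independent normals is normal: means add, variances add.
sum_mean = sum(w * m for w, m in zip(weights, means))
sum_sd = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, sds)))
sum_dist = NormalDist(sum_mean, sum_sd)

rng = random.Random(1)      # fixed seed so the run is repeatable
n = 200_000
x = 0.5
draws = [sample_weighted_sum(rng) for _ in range(n)]
empirical_sum = sum(d <= x for d in draws) / n

print(f"empirical P(Y <= {x}) = {empirical_sum:.4f}")
print(f"normal    P(Y <= {x}) = {sum_dist.cdf(x):.4f}")
print(f"mixture   P(X <= {x}) = {mixture_cdf(x):.4f}")
```

For these parameters the simulated sum should match the single normal distribution, while the mixture probability at the same $x$ is noticeably different, which is the discrepancy described in the question.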
