Why Normal Mixture Models Are Not Identifiable and Why It Matters

Suppose that we have a mixture model:

$$
p_\theta(y) = \sum_{k = 1}^{K}w_k \phi(y;\mu_k, \sigma^2_k)
$$

where $\phi(y;\mu_k, \sigma^2_k)$ is the normal density at $y$ with mean $\mu_k$ and variance $\sigma^2_k$, and $\theta$ contains the weights, means, and variances. Why is this model not identifiable?
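For concreteness, the mixture density above can be evaluated numerically. Here is a minimal NumPy sketch (the function names are my own, not from the question):

```python
import numpy as np

def normal_pdf(y, mean, var):
    # Density of N(mean, var) evaluated at y.
    return np.exp(-(y - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def mixture_pdf(y, weights, means, variances):
    # p_theta(y) = sum_k w_k * phi(y; mu_k, sigma_k^2)
    return sum(w * normal_pdf(y, m, v)
               for w, m, v in zip(weights, means, variances))

# Two equally weighted components: N(0, 1) and N(1, 1).
value = mixture_pdf(0.0, [0.5, 0.5], [0.0, 1.0], [1.0, 1.0])
```

Here `value` is $0.5\,\phi(0;0,1) + 0.5\,\phi(0;1,1) \approx 0.32$.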

I know that, by definition, a model is not identifiable when distinct parameter values can induce the same distribution:

$$
\exists\, \theta_1 \neq \theta_2 \quad \text{such that} \quad p_{\theta_1} = p_{\theta_2}
$$

What is an explicit example here where this holds, and why is it bad if the model is unidentifiable? Thanks.

Best Answer

Consider the parameter vector $\theta = (w_1 = 0.5, \mu_1 = 0, \sigma_1^2 = 1,\; w_2 = 0.5, \mu_2 = 1, \sigma_2^2 = 1)$. We get exactly the same fit to the data with the label-swapped vector $\theta' = (w_1 = 0.5, \mu_1 = 1, \sigma_1^2 = 1,\; w_2 = 0.5, \mu_2 = 0, \sigma_2^2 = 1)$, since $\theta \neq \theta'$ but $p_\theta = p_{\theta'}$. Thus there is no way to empirically learn the value of $\mu_1$ regardless of the amount of data (i.e., it is not identified).
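The label swap can be checked numerically: the two parameter vectors give the same density at every point. A short sketch (helper names are mine, assuming NumPy):

```python
import numpy as np

def normal_pdf(y, mean, var):
    # Density of N(mean, var) evaluated at y.
    return np.exp(-(y - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def mixture_pdf(y, weights, means, variances):
    # p_theta(y) = sum_k w_k * phi(y; mu_k, sigma_k^2)
    return sum(w * normal_pdf(y, m, v)
               for w, m, v in zip(weights, means, variances))

grid = np.linspace(-5.0, 6.0, 201)

# theta: component 1 is N(0, 1), component 2 is N(1, 1).
theta = ([0.5, 0.5], [0.0, 1.0], [1.0, 1.0])
# theta': the same components with the labels swapped.
theta_swapped = ([0.5, 0.5], [1.0, 0.0], [1.0, 1.0])

# The densities agree at every grid point, so no amount of data
# can distinguish theta from theta_swapped.
same = np.allclose(mixture_pdf(grid, *theta),
                   mixture_pdf(grid, *theta_swapped))
```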

In this case, the absence of identifiability is not "bad": the fitted mixture density is the same either way, and whether we call a given component the first or the second is immaterial.
