# Gaussian Mixture Model – Why Can’t MLE Be Implemented Directly?

Tags: gaussian-mixture-distribution, latent-variable, maximum-likelihood, self-study

Consider the following density, the mixture of two Gaussian distributions,
\begin{align*} p(x)= p(k=1) N(x|\mu_1,\sigma^2_1) + p(k=0) N(x|\mu_0,\sigma^2_0) , \end{align*}
where $$p(k=1)+p(k=0)=\pi_1+\pi_0=1$$ and $$N(x|\mu,\sigma^2)$$ is the density of the Gaussian distribution with mean $$\mu$$ and variance $$\sigma^2$$.
The parameters of interest are $$\pi_0$$, the $$\mu_i$$'s, and the $$\sigma^2_i$$'s ($$i=0,1$$), five in total.

This Q&A derives the MLE for the mixture of two Gaussian distributions when the latent variables $$K_i$$ are observed.
In this question, suppose we observe only the $$X_i$$'s, while the latent variables $$K_i$$ are unobserved.
Classical methods for estimating these $$5$$ unknown parameters are the EM algorithm and MCMC sampling; see Hastie et al. (2009) for details.
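To make the EM route concrete, here is a minimal sketch of EM for this two-component model in Python/NumPy. The function name, initialisation scheme, and iteration count are my own choices, not part of the question; the parameter names mirror the question's notation.

```python
import numpy as np

def em_two_gaussians(x, n_iter=200):
    """EM sketch for a two-component Gaussian mixture.

    Initialisation (component means at the sample extremes, both
    variances at the overall sample variance) is an arbitrary choice.
    """
    pi = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()], dtype=float)
    sigma2 = np.array([x.var(), x.var()])
    for _ in range(n_iter):
        # E-step: responsibilities p(k | x_i, theta) for k = 0, 1
        dens = np.stack([
            pi[k] * np.exp(-(x - mu[k]) ** 2 / (2 * sigma2[k]))
                  / np.sqrt(2 * np.pi * sigma2[k])
            for k in (0, 1)
        ])
        resp = dens / dens.sum(axis=0)
        # M-step: weighted maximum-likelihood updates
        nk = resp.sum(axis=1)
        pi = nk / len(x)
        mu = (resp * x).sum(axis=1) / nk
        sigma2 = (resp * (x - mu[:, None]) ** 2).sum(axis=1) / nk
    return pi, mu, sigma2
```

Each E-step fills in the unobserved $$K_i$$'s "softly" via posterior responsibilities; this is precisely the step that direct maximum likelihood cannot carry out in closed form.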

Why can't the MLE be implemented directly for the Gaussian mixture model?

My attempt:

If the $$k_i$$'s were observed, the (complete-data) log-likelihood would be
\begin{align*} \ln P(x|\theta) = \sum_{i=1}^n \bigg[ (1-k_i) (\ln \pi_0 + \ln N(x_i|\mu_0,\sigma_0^2))+k_i(\ln \pi_1 + \ln N(x_i|\mu_1,\sigma_1^2)) \bigg]. \end{align*}
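Since the $$K_i$$'s are unobserved, however, this complete-data log-likelihood cannot be evaluated. Marginalising over $$K_i$$ gives the observed-data log-likelihood,
\begin{align*} \ln p(x|\theta) = \sum_{i=1}^n \ln \big[ \pi_0 N(x_i|\mu_0,\sigma_0^2) + \pi_1 N(x_i|\mu_1,\sigma_1^2) \big], \end{align*}
where the sum inside the logarithm no longer decouples across components, so the score equations have no closed-form solution.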