Solved – Gaussian Mixture Model – Model selection using the held-out likelihood

finite-mixture-model, gaussian-mixture-distribution, likelihood, overfitting

I am trying to understand how to select the number of components in a Gaussian Mixture Model (GMM). Most presentations mention the use of criteria such as AIC and BIC.

But if we simply follow the model selection approaches used in supervised learning, we could, for example, perform cross-validation and estimate the likelihood on each held-out set. We would then choose the model with the highest average held-out likelihood. Is this a valid approach for selecting the number of components of a GMM?
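
For concreteness, here is a minimal sketch of the procedure I have in mind, assuming scikit-learn's `GaussianMixture` and `KFold`; the toy dataset `X` is hypothetical and just stands in for whatever data is being fit:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
# Hypothetical toy data: two well-separated Gaussian clusters in 2-D.
X = np.vstack([rng.normal(0, 1, (200, 2)),
               rng.normal(5, 1, (200, 2))])

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for k in range(1, 7):
    fold_scores = []
    for train_idx, test_idx in kf.split(X):
        gmm = GaussianMixture(n_components=k, random_state=0)
        gmm.fit(X[train_idx])
        # score() returns the average per-sample log-likelihood
        # on the held-out fold.
        fold_scores.append(gmm.score(X[test_idx]))
    print(f"k={k}: mean held-out log-likelihood = {np.mean(fold_scores):.3f}")
```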

Best Answer

I would say that using the held-out likelihood is a good approach. For a mixture of Gaussians, the more components we use, the higher the training likelihood we can achieve, just as increasing the order in polynomial regression always improves the training fit.

AIC and BIC penalize the number of parameters (and hence the number of Gaussians) automatically, but evaluating the likelihood on a separate test set is an equally good choice. An extreme example makes the point: suppose we use as many Gaussians as there are data points in the training set. Each component can then place its mean exactly on one data point and shrink its variance toward zero, which drives the training likelihood to infinity, yet the test likelihood will be poor. This is the same overfitting behavior that model selection guards against in any other machine learning setting.
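
To illustrate, here is a short sketch under the same scikit-learn assumptions as above, comparing training likelihood, held-out likelihood, and AIC/BIC as the number of components grows. (scikit-learn's default covariance regularization keeps the training likelihood finite rather than literally infinite, but the overfitting pattern still shows.)

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Same hypothetical two-cluster data as before.
X = np.vstack([rng.normal(0, 1, (200, 2)),
               rng.normal(5, 1, (200, 2))])
X_train, X_test = train_test_split(X, test_size=0.5, random_state=0)

for k in (1, 2, 5, 20):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(X_train)
    # Training likelihood keeps improving with k; the held-out
    # likelihood and the information criteria do not.
    print(f"k={k:2d}  train LL={gmm.score(X_train):7.3f}  "
          f"test LL={gmm.score(X_test):7.3f}  "
          f"AIC={gmm.aic(X_train):9.1f}  BIC={gmm.bic(X_train):9.1f}")
```

Typically the training log-likelihood climbs monotonically with k, while the held-out log-likelihood peaks near the true number of clusters, and AIC/BIC point to a similar choice.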