Solved – How to “split” Gaussian mixture components when training EM/GMM based classifier

expectation-maximization, gaussian mixture distribution, machine learning

To improve the performance of my Gaussian Mixture Model based classifier, I was advised to start with a single multivariate Gaussian and estimate its parameters, then "split" it into a mixture of two components, re-estimate their parameters over several training cycles, check the classification performance on my cross-validation set, and iterate the process, splitting further mixture components in turn.

However, I don't really understand what it means to "split" a mixture component, or how it is done.

Any help?

Best Answer

After fitting the first Gaussian component, you have its mean and covariance. To split it just means to create a new model with two components in place of the one. I'd suggest deriving both new means from the original: half a standard deviation added to one and subtracted from the other, i.e. $\mu_1 = \mu_0 + 0.5\,\sigma_0$ and $\mu_2 = \mu_0 - 0.5\,\sigma_0$ (in the multivariate case, $\sigma_0$ can be taken along the principal axis of the covariance). Intuitively, this shares the data roughly equally between the two new components. Then re-optimise the two-component model with EM and repeat. At each stage, you could take the component with the largest covariance (e.g. by trace or determinant) and replace it with two components.
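Here is a minimal sketch of that loop using scikit-learn's `GaussianMixture`, which accepts warm-start parameters (`weights_init`, `means_init`, `precisions_init`). The names `split_component` and `grow_mixture`, the 0.5 offset along the principal axis, reusing the parent covariance for both children, and choosing the split target by covariance trace are all illustrative choices, not a prescribed method:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def split_component(mean, cov, offset=0.5):
    """Split one Gaussian into two means offset by +/- `offset`
    standard deviations along the covariance's principal axis."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    direction = eigvecs[:, -1] * np.sqrt(eigvals[-1])  # principal std-dev vector
    return mean + offset * direction, mean - offset * direction

def grow_mixture(X, max_components=8):
    # Start from a single multivariate Gaussian fitted to the data.
    gmm = GaussianMixture(n_components=1, covariance_type="full").fit(X)
    while gmm.n_components < max_components:
        # Pick the component with the largest covariance (here: largest trace).
        k = int(np.argmax([np.trace(c) for c in gmm.covariances_]))
        mu1, mu2 = split_component(gmm.means_[k], gmm.covariances_[k])

        # Replace component k with the two new components; each child keeps
        # the parent covariance and half the parent weight.
        means = np.vstack([np.delete(gmm.means_, k, axis=0), mu1, mu2])
        covs = np.concatenate([np.delete(gmm.covariances_, k, axis=0),
                               gmm.covariances_[[k, k]]])
        w = np.append(np.delete(gmm.weights_, k), [gmm.weights_[k] / 2] * 2)

        # Re-optimise the enlarged model with EM, warm-started at the split.
        gmm = GaussianMixture(
            n_components=gmm.n_components + 1,
            covariance_type="full",
            weights_init=w,
            means_init=means,
            precisions_init=np.linalg.inv(covs),
        ).fit(X)
    return gmm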

[EDIT] The exact method used to initialise the new components shouldn't be critical, as the parameters will be optimised further by EM anyway. See also Ueda et al. (2000), for example. One option they suggest is to set the new means to random perturbations of the old component's mean, with unit covariance for the new components.
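For completeness, one possible reading of that perturbation scheme as code; the function name and the perturbation scale `eps` are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def random_perturbation_split(mean, eps=0.1, rng=None):
    """One reading of the Ueda et al. initialisation: new means are small
    random perturbations of the old mean, and both new components get unit
    covariance. The scale `eps` is an illustrative choice."""
    rng = np.random.default_rng() if rng is None else rng
    d = len(mean)
    mu1 = mean + eps * rng.standard_normal(d)
    mu2 = mean + eps * rng.standard_normal(d)
    cov = np.eye(d)  # unit covariance for both new components
    return (mu1, cov), (mu2, cov)
```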