I have a vector x, with lower_bound < x < upper_bound. I would like to fit a truncated normal mixture distribution to x. I can use the package mixtools to fit a normal mixture:
library(mixtools)
mix_fit <- normalmixEM(x)
but that does not account for the upper and lower bounds. Is there any package in R that fits truncated normal mixtures? Otherwise, I guess I'd have to implement my own EM. So if no packages have this functionality, I'd welcome any good references on implementation details for that.
Best Answer
A direct approach to estimating a mixture of two Normal distributions truncated to $(a,b)$, $$f(x;\boldsymbol{\theta})=\varpi_1 \varphi(x;\mu_1,\sigma_1,a,b)+(1-\varpi_1)\varphi(x;\mu_0,\sigma_0,a,b)$$ where, for $a<x<b$, $$\varphi(x;\mu_1,\sigma_1,a,b)=\dfrac{\exp\{-(x-\mu_1)^2/2\sigma_1^2\}}{\sqrt{2\pi}\sigma_1[\Phi(\{b-\mu_1\}/\sigma_1)-\Phi(\{a-\mu_1\}/\sigma_1)]}$$ is to run EM on the complete-data likelihood $$\prod_{i=1}^n [\varpi_1 \varphi(x_i;\mu_1,\sigma_1,a,b)]^{z_i}[(1-\varpi_1)\varphi(x_i;\mu_0,\sigma_0,a,b)]^{1-z_i}$$

With $\boldsymbol{\theta}^-$ denoting the current parameter value, the E-step target is $$\begin{aligned}&\sum_{i=1}^n \mathbb E[Z_i|x_i,\boldsymbol{\theta}^-]\log [\varpi_1 \varphi(x_i;\mu_1,\sigma_1,a,b)]\\&\quad+\sum_{i=1}^n \mathbb E[1-Z_i|x_i,\boldsymbol{\theta}^-] \log [(1-\varpi_1) \varphi(x_i;\mu_0,\sigma_0,a,b)]\end{aligned}$$ where $$\mathbb E[Z_i|x_i,\boldsymbol{\theta}]=\dfrac{ \varpi_1 \varphi(x_i;\mu_1,\sigma_1,a,b)}{f(x_i;\boldsymbol{\theta})}$$ is evaluated at $\boldsymbol{\theta}=\boldsymbol{\theta}^-$.

The M step then updates the weight as $$\varpi_1^+ = \frac{1}{n}\sum_{i=1}^n \mathbb E[Z_i|x_i,\boldsymbol{\theta}^-]$$ and the component parameters as $$\begin{aligned}(\mu_1^+,\mu_0^+,\sigma_1^+,\sigma_0^+) = \arg\max\ &\sum_{i=1}^n \mathbb E[Z_i|x_i,\boldsymbol{\theta}^-]\log \varphi(x_i;\mu_1,\sigma_1,a,b)\\&+\sum_{i=1}^n \mathbb E[1-Z_i|x_i,\boldsymbol{\theta}^-] \log \varphi(x_i;\mu_0,\sigma_0,a,b)\end{aligned}$$ Unfortunately, this maximisation has no closed-form solution, because $\mu_j$ and $\sigma_j$ also enter through the normalising constants $\Phi(\{b-\mu_j\}/\sigma_j)-\Phi(\{a-\mu_j\}/\sigma_j)$, so it has to be carried out numerically.
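The direct EM scheme above can be sketched in base R by replacing the intractable M step with a numerical maximisation. This is only a minimal illustration under stated assumptions, not a tested package: the names `ldtnorm` and `tnormmix_em`, the median-split starting values, the log-parametrisation of the standard deviations and the use of `optim` (default Nelder-Mead) are all choices made here for the example. Because the weighted objective splits into one sum per component, each $(\mu_j,\sigma_j)$ pair is optimised separately.

```r
# Log-density of a Normal(mu, sigma) truncated to (a, b), for a < x < b.
ldtnorm <- function(x, mu, sigma, a, b) {
  dnorm(x, mu, sigma, log = TRUE) -
    log(pnorm(b, mu, sigma) - pnorm(a, mu, sigma))
}

# EM for a two-component truncated normal mixture with a numerical M step.
# Hypothetical helper written for this illustration only.
tnormmix_em <- function(x, a, b, max_iter = 200, tol = 1e-8) {
  # Crude starting values (an assumption): split the sample at its median.
  hi <- x[x > median(x)]
  lo <- x[x <= median(x)]
  p1  <- 0.5
  th1 <- c(mean(hi), log(sd(hi)))  # component 1: (mu_1, log sigma_1)
  th0 <- c(mean(lo), log(sd(lo)))  # component 0: (mu_0, log sigma_0)

  # Weighted negative log-likelihood of one truncated component.
  nll <- function(th, wt) {
    val <- -sum(wt * ldtnorm(x, th[1], exp(th[2]), a, b))
    if (is.finite(val)) val else 1e10  # guard against numerical underflow
  }

  loglik_old <- -Inf
  for (iter in seq_len(max_iter)) {
    # E step: posterior probability that x_i comes from component 1,
    # computed on the log scale for stability.
    l1  <- log(p1)     + ldtnorm(x, th1[1], exp(th1[2]), a, b)
    l0  <- log(1 - p1) + ldtnorm(x, th0[1], exp(th0[2]), a, b)
    m   <- pmax(l1, l0)
    lse <- m + log(exp(l1 - m) + exp(l0 - m))
    w   <- exp(l1 - lse)
    loglik <- sum(lse)  # observed log-likelihood at the current parameters

    # M step: closed form for the weight, numerical for (mu, sigma).
    p1  <- mean(w)
    th1 <- optim(th1, nll, wt = w)$par
    th0 <- optim(th0, nll, wt = 1 - w)$par

    if (abs(loglik - loglik_old) < tol * (abs(loglik) + tol)) break
    loglik_old <- loglik
  }

  list(p1 = p1,
       mu = c(mu1 = th1[1], mu0 = th0[1]),
       sigma = c(sigma1 = exp(th1[2]), sigma0 = exp(th0[2])),
       loglik = loglik, iterations = iter)
}

# Usage on simulated data: two normals, kept only inside (0, 5).
set.seed(1)
x <- c(rnorm(3000, 1, 1), rnorm(3000, 4, 1))
x <- x[x > 0 & x < 5]
fit <- tnormmix_em(x, a = 0, b = 5, max_iter = 50)
str(fit)
```

The returned `loglik` is the observed log-likelihood evaluated before the final parameter update. As with any EM run on a mixture, the result depends on the starting values, so several restarts are advisable; a gradient-based optimiser could be substituted for Nelder-Mead in the M step.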
A potentially interesting alternative is to add to the observed sample $(x_1,\ldots,x_n)$ a latent sample $$(Y_1,\ldots,Y_{N_1},W_1,\ldots,W_{N_2})$$ such that