Solved – EM Algorithm – Derivative wrt covariance

expectation-maximization

I am struggling to find the derivative of the expression below with respect to the inverse covariance matrix $\Sigma_k^{-1}$, i.e. $\frac{\partial}{\partial \Sigma_k^{-1}}$.

$ \max \sum_{n=1}^N \sum_{k=1}^K q_{kn} \log \left( \frac{\pi_k}{q_{kn}} \times \frac{1}{\sqrt{(2\pi)^d \lvert\Sigma_k\rvert}} e^{-\frac{1}{2}(x_n - \mu_k)^T \Sigma_k^{-1} (x_n - \mu_k)} \right) $, where $d$ is the dimension of $x_n$.

I have the solution (originally posted as an image), but I was hoping someone could show me how to derive it.


Best Answer

This answer and this tutorial give the derivatives of the log of a multivariate Gaussian density; the rest follows in a few steps.

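Expanding the logarithm, with $d$ the dimension of $x_n$:

$$Q=\sum_n\sum_k q_{kn}\left[\log\pi_k-\log q_{kn}-\frac{d}{2}\log 2\pi-\frac{1}{2}\log\lvert\Sigma_k\rvert-\frac{1}{2}(x_n-\mu_k)^T\Sigma_k^{-1}(x_n-\mu_k)\right]$$

The $\log q_{kn}$ and $\log 2\pi$ terms do not depend on $\mu_k$ or $\Sigma_k$, so they drop out when differentiating.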

$$\frac{\partial Q}{\partial\mu_k}=\sum_n q_{kn}\frac{\partial}{\partial\mu_k}\left[-\frac{1}{2}(x_n-\mu_k)^T\Sigma_k^{-1}(x_n-\mu_k)\right]=\sum_n q_{kn}\Sigma_k^{-1}(x_n-\mu_k)$$

$$\frac{\partial Q}{\partial\Sigma_k}=\sum_n q_{kn}\frac{\partial}{\partial\Sigma_k}\left[-\frac{1}{2}\log\lvert\Sigma_k\rvert-\frac{1}{2}(x_n-\mu_k)^T\Sigma_k^{-1}(x_n-\mu_k)\right]$$

$$=-\frac{1}{2}\sum_n q_{kn}\left(\Sigma_k^{-1}-\Sigma_k^{-1}(x_n-\mu_k)(x_n-\mu_k)^T\Sigma_k^{-1}\right)$$
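For reference, the covariance derivative relies on two standard matrix-calculus identities. They are stated here for a symmetric invertible $\Sigma$ and a fixed vector $a$ (here $a=x_n-\mu_k$), treating the entries of $\Sigma$ as independent:

$$\frac{\partial\log\lvert\Sigma\rvert}{\partial\Sigma}=\Sigma^{-1},\qquad\frac{\partial\,a^T\Sigma^{-1}a}{\partial\Sigma}=-\Sigma^{-1}aa^T\Sigma^{-1}$$

Since the question asks about $\Sigma_k^{-1}$ specifically, the same calculation can be sketched in terms of the precision matrix. Writing $\Lambda_k=\Sigma_k^{-1}$ (a symbol introduced here, not used in the original post), we have $\log\lvert\Sigma_k\rvert=-\log\lvert\Lambda_k\rvert$, and therefore

$$\frac{\partial Q}{\partial\Lambda_k}=\frac{1}{2}\sum_n q_{kn}\left(\Lambda_k^{-1}-(x_n-\mu_k)(x_n-\mu_k)^T\right)$$

The two forms are related by $\frac{\partial Q}{\partial\Sigma_k}=-\Lambda_k\frac{\partial Q}{\partial\Lambda_k}\Lambda_k$, so they vanish at the same point.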

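Setting the derivatives to zero gives the desired results. For the mean, multiplying $\sum_n q_{kn}\Sigma_k^{-1}(x_n-\mu_k)=0$ by $\Sigma_k$ gives

$$\mu_k=\frac{\sum_n q_{kn}x_n}{\sum_n q_{kn}}$$

For the covariance, start from

$$-\frac{1}{2}\sum_n q_{kn}\left(\Sigma_k^{-1}-\Sigma_k^{-1}(x_n-\mu_k)(x_n-\mu_k)^T\Sigma_k^{-1}\right)=0$$

Multiply on the left and on the right by $\Sigma_k$:

$$-\frac{1}{2}\sum_n q_{kn}\Sigma_k\left(\Sigma_k^{-1}-\Sigma_k^{-1}(x_n-\mu_k)(x_n-\mu_k)^T\Sigma_k^{-1}\right)\Sigma_k=0$$

$$-\frac{1}{2}\sum_n q_{kn}\left(\Sigma_k-(x_n-\mu_k)(x_n-\mu_k)^T\right)=0$$

$$\sum_n q_{kn}\Sigma_k-\sum_n q_{kn}(x_n-\mu_k)(x_n-\mu_k)^T=0$$

$$\Sigma_k=\frac{\sum_n q_{kn}(x_n-\mu_k)(x_n-\mu_k)^T}{\sum_n q_{kn}}$$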
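If you want to convince yourself numerically, here is a minimal NumPy sketch for a single component $k$. It compares the analytic gradient with central finite differences and checks that the closed-form $\Sigma_k$ makes the gradient vanish. The function names, the synthetic data, and the random responsibilities `q` are all assumptions made for illustration; none of them come from the original post.

```python
import numpy as np

def q_objective(X, q, mu, Sigma):
    """Sigma-dependent part of Q for one component (constant terms dropped)."""
    diff = X - mu                                   # shape (N, d)
    _, logdet = np.linalg.slogdet(Sigma)            # log|Sigma|
    maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(Sigma), diff)
    return np.sum(q * (-0.5 * logdet - 0.5 * maha))

def grad_sigma(X, q, mu, Sigma):
    """Analytic gradient: -1/2 sum_n q_n (P - P a_n a_n^T P), with P = Sigma^{-1}."""
    diff = X - mu
    P = np.linalg.inv(Sigma)
    S = (q[:, None] * diff).T @ diff                # sum_n q_n a_n a_n^T
    return -0.5 * (q.sum() * P - P @ S @ P)

def m_step_sigma(X, q, mu):
    """Closed-form update: weighted scatter matrix divided by total weight."""
    diff = X - mu
    return (q[:, None] * diff).T @ diff / q.sum()

# Synthetic data and responsibilities (illustrative assumptions only).
rng = np.random.default_rng(0)
N, d = 200, 3
X = rng.normal(size=(N, d))
q = rng.uniform(0.1, 1.0, size=N)
mu = (q[:, None] * X).sum(axis=0) / q.sum()         # weighted mean

# 1) The gradient should vanish at the closed-form covariance.
Sigma_hat = m_step_sigma(X, q, mu)
grad_norm = np.abs(grad_sigma(X, q, mu, Sigma_hat)).max()

# 2) Finite-difference check of the gradient at an arbitrary SPD matrix,
#    perturbing each entry of Sigma independently.
A = rng.normal(size=(d, d))
Sigma0 = A @ A.T + d * np.eye(d)
eps = 1e-5
numeric = np.zeros((d, d))
for i in range(d):
    for j in range(d):
        E = np.zeros((d, d))
        E[i, j] = eps
        numeric[i, j] = (q_objective(X, q, mu, Sigma0 + E)
                         - q_objective(X, q, mu, Sigma0 - E)) / (2 * eps)
fd_error = np.abs(numeric - grad_sigma(X, q, mu, Sigma0)).max()

print("max |gradient| at closed-form Sigma:", grad_norm)
print("max |finite difference - analytic|:", fd_error)
```

Both printed values should be close to zero, up to floating-point and finite-difference error. The objective at `Sigma_hat` should also be at least as large as at any other positive-definite matrix, such as `Sigma0`.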