Deriving MLE of $\mu$ in Multivariate Gaussian Distribution

Tags: calculus, logarithms, maximum-likelihood, normal-distribution, probability

Let's consider a Gaussian distribution whose covariance matrix $\Sigma$ is known, and suppose we have $N$ data points $y_n \in \mathbb{R}^D$.
The likelihood function of $\mu$ is $$L(\mu)=\prod_{n=1}^N \frac{1}{\sqrt{(2\pi)^D |\Sigma|}}\exp\left(-\frac{1}{2}(y_n-\mu)^T\Sigma^{-1}(y_n-\mu)\right)$$
and taking the logarithm of both sides gives

$$\log L(\mu)=-\frac{N}{2}\log\left((2\pi)^D|\Sigma|\right)-\frac{1}{2}\sum_{n=1}^N (y_n-\mu)^T\Sigma^{-1}(y_n-\mu)$$
My question is: how should I compute $\hat{\mu}_{ML}$ from here?
Thank you in advance.
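
For a quick numerical sanity check of the log-likelihood above, here is a minimal sketch (assuming NumPy/SciPy are available; the data and parameter values are made up for illustration) that compares it against the sum of `scipy.stats.multivariate_normal.logpdf` over the data:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
N, D = 500, 3
mu_true = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 0.5]])
y = rng.multivariate_normal(mu_true, Sigma, size=N)          # N x D data matrix

def log_likelihood(mu):
    """log L(mu) written exactly as in the formula above, with Sigma known."""
    diff = y - mu                                             # N x D
    quad = np.einsum('ni,ij,nj->n', diff, np.linalg.inv(Sigma), diff)
    return -0.5 * N * np.log((2 * np.pi) ** D * np.linalg.det(Sigma)) - 0.5 * quad.sum()

mu0 = np.zeros(D)                                             # arbitrary trial value of mu
print(log_likelihood(mu0))
print(multivariate_normal.logpdf(y, mean=mu0, cov=Sigma).sum())  # should agree
```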

Best Answer

Let $Y_1,\ldots,Y_n$ be i.i.d. $\mathsf N_D\left(\mu,\Sigma\right).$ The likelihood function is given by

$$L(\mu,\Sigma )=\left(\frac{1}{\sqrt{2\pi}}\right)^{nD}\frac{1}{|\Sigma|^{n/2}}\exp\left(-\frac{1}{2}\sum_{i=1}^n(Y_i-\mu)'\Sigma^{-1}(Y_i-\mu)\right)$$

Maximizing $L$ with respect to $\mu$ and $\Sigma$ is equivalent to minimizing $-2\log(L)$ with respect to $\mu$ and $\Sigma.$

Now

$$\begin{align*} -2\log L(\mu,\Sigma) &=\sum_{i=1}^n(Y_i-\mu)'\Sigma^{-1}(Y_i-\mu)+n\log(|\Sigma|)+nD\log(2\pi)\\\\ &=n\cdot{\bf{tr}}\left(\Sigma^{-1}\bar{S}\right)+n\left(\bar{Y}-\mu\right)'\Sigma^{-1}\left(\bar{Y}-\mu\right)+n\log(|\Sigma|)+nD\log(2\pi) \end{align*}$$

where $\bar{S}=\frac{n-1}{n}S=\frac{1}{n}\sum_{i=1}^n\left(Y_i-\bar{Y}\right)\left(Y_i-\bar{Y}\right)'$ and $S$ is the sample variance-covariance matrix.
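
The decomposition above can be verified numerically; here is a minimal sketch (NumPy only, with arbitrary test values) comparing the two sides of the identity:

```python
import numpy as np

rng = np.random.default_rng(1)
n, D = 200, 3
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 0.5]])
Y = rng.multivariate_normal(np.array([1.0, -2.0, 0.5]), Sigma, size=n)
mu = np.array([0.0, 0.0, 1.0])                     # arbitrary trial value of mu

Sigma_inv = np.linalg.inv(Sigma)
Y_bar = Y.mean(axis=0)
S_bar = (Y - Y_bar).T @ (Y - Y_bar) / n            # S_bar = (n-1)/n * S

# Left-hand side: sum_i (Y_i - mu)' Sigma^{-1} (Y_i - mu)
lhs = np.einsum('ni,ij,nj->', Y - mu, Sigma_inv, Y - mu)

# Right-hand side: n*tr(Sigma^{-1} S_bar) + n*(Y_bar - mu)' Sigma^{-1} (Y_bar - mu)
rhs = n * np.trace(Sigma_inv @ S_bar) + n * (Y_bar - mu) @ Sigma_inv @ (Y_bar - mu)

print(np.isclose(lhs, rhs))                        # True
```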

From here, it is clear that only the term $n\left(\bar{Y}-\mu\right)'\Sigma^{-1}\left(\bar{Y}-\mu\right)$ depends on $\mu$. Since $\Sigma^{-1}$ is positive definite, this quadratic form is nonnegative and equals zero exactly when $\mu=\bar{Y}$, so minimizing $-2\log(L)$ with respect to $\mu$ gives $\hat{\mu}_{ML}=\bar{Y}.$
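
To illustrate the result, here is a minimal sketch (assuming NumPy/SciPy; the data and parameter values are made up) that minimizes $-2\log L$ over $\mu$ numerically with $\Sigma$ held fixed, and compares the optimum with the sample mean:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal

rng = np.random.default_rng(2)
n, D = 300, 3
mu_true = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 0.5]])
Y = rng.multivariate_normal(mu_true, Sigma, size=n)

def neg2_loglik(mu):
    """-2 log L(mu) with Sigma known and held fixed."""
    return -2.0 * multivariate_normal.logpdf(Y, mean=mu, cov=Sigma).sum()

res = minimize(neg2_loglik, x0=np.zeros(D))

print(res.x)           # numerical minimizer of -2 log L
print(Y.mean(axis=0))  # sample mean Y_bar -- matches up to solver tolerance
```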