Gaussian Distribution – Entropy of the Multivariate Gaussian

entropy, gaussian-integral, multivariable-calculus

Show that the entropy of the multivariate Gaussian $N(x|\mu,\Sigma)$ is given by
\begin{align}
H[x] = \frac12\ln|\Sigma| + \frac{D}{2}(1 + \ln(2\pi))
\end{align}
where $D$ is the dimensionality of $x$.

My solution.

Entropy of the multivariate normal distribution:

\begin{align}
H[x] &= -\int_{\mathbb{R}^D} N(x|\mu,\Sigma)\ln N(x|\mu,\Sigma)\, dx &&\text{by definition of entropy}\\
&= -E[\ln N(x|\mu,\Sigma)]\\
&= -E\!\left[\ln\!\left((2\pi)^{-\frac{D}{2}} |\Sigma|^{-\frac12} e^{-\frac12(x - \mu)^T\Sigma^{-1}(x - \mu)}\right)\right] &&\text{definition of the multivariate Gaussian}\\
&= \frac{D}{2}\ln(2\pi) + \frac12\ln|\Sigma| + \frac12 E[(x - \mu)^T\Sigma^{-1}(x - \mu)] &&\text{the log of a product is the sum of the logs}
\end{align}
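As a quick sanity check on this decomposition (my own addition, not part of the original question), here is a minimal Python sketch with an arbitrary $\mu$, $\Sigma$ and sample size; it Monte Carlo-estimates $-E[\ln N(x|\mu,\Sigma)]$ using SciPy and compares it against the closed form to be shown.

```python
# Minimal sketch: Monte Carlo estimate of H[x] = -E[ln N(x|mu, Sigma)]
# versus the closed-form target. mu, Sigma and the sample size are arbitrary.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
D = 3
mu = np.array([1.0, -2.0, 0.5])
A = rng.normal(size=(D, D))
Sigma = A @ A.T + D * np.eye(D)        # random symmetric positive-definite covariance

dist = multivariate_normal(mean=mu, cov=Sigma)
x = dist.rvs(size=200_000, random_state=rng)

mc_entropy = -dist.logpdf(x).mean()    # Monte Carlo estimate of -E[ln N(x|mu, Sigma)]
closed_form = 0.5 * np.log(np.linalg.det(Sigma)) + 0.5 * D * (1 + np.log(2 * np.pi))

print(mc_entropy, closed_form)         # the two values agree up to Monte Carlo error
```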

Consider the third term:

\begin{align}
\frac12 E[(x - \mu)^T\Sigma^{-1}(x - \mu)] &= \frac12 E[x^T\Sigma^{-1}x - x^T\Sigma^{-1}\mu - \mu^T\Sigma^{-1}x + \mu^T\Sigma^{-1}\mu]\\
&= \frac12 E[x^T\Sigma^{-1}x] - \frac12 E[2\mu^T\Sigma^{-1}x] + \frac12 E[\mu^T\Sigma^{-1}\mu] &&\text{$x^T\Sigma^{-1}\mu = \mu^T\Sigma^{-1}x$, since $\Sigma^{-1}$ is symmetric}\\
&= \frac12 E[x^T\Sigma^{-1}x] - \mu^T\Sigma^{-1}E[x] + \frac12\mu^T\Sigma^{-1}\mu\\
&= \frac12 E[x^T\Sigma^{-1}x] - \mu^T\Sigma^{-1}\mu + \frac12\mu^T\Sigma^{-1}\mu &&\text{since $E[x] = \mu$}\\
&= \frac12 E[x^T\Sigma^{-1}x] - \frac12\mu^T\Sigma^{-1}\mu
\end{align}
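The same kind of numerical sketch (again my own addition, with arbitrary $\mu$, $\Sigma$ and sample size) can confirm the intermediate identity reached above before simplifying further:

```python
# Quick numerical check of
#   E[(x - mu)^T Sigma^{-1} (x - mu)] = E[x^T Sigma^{-1} x] - mu^T Sigma^{-1} mu
# for an arbitrary mu and Sigma.
import numpy as np

rng = np.random.default_rng(1)
D = 3
mu = np.array([1.0, -2.0, 0.5])
A = rng.normal(size=(D, D))
Sigma = A @ A.T + D * np.eye(D)        # random symmetric positive-definite covariance
Sigma_inv = np.linalg.inv(Sigma)

x = rng.multivariate_normal(mu, Sigma, size=200_000)

# per-sample quadratic forms, then averaged over the samples
lhs = np.einsum('ni,ij,nj->n', x - mu, Sigma_inv, x - mu).mean()
rhs = np.einsum('ni,ij,nj->n', x, Sigma_inv, x).mean() - mu @ Sigma_inv @ mu

print(lhs, rhs)   # the two estimates coincide up to Monte Carlo error
```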

How can I simplify the term $E[x^T\Sigma^{-1}x]$?

Best Answer

It's better to simplify the term $\mathbb{E}[(x-\mu)^T \Sigma^{-1}(x-\mu)]$ directly:

\begin{align}
\mathbb{E}[(x-\mu)^T \Sigma^{-1}(x-\mu)] &= \mathbb{E}[\mathrm{tr}((x-\mu)^T \Sigma^{-1}(x-\mu))] &&\text{a scalar equals its own trace}\\
&= \mathbb{E}[\mathrm{tr}(\Sigma^{-1}(x-\mu)(x-\mu)^T)] &&\text{cyclic property of the trace}\\
&= \mathrm{tr}(\mathbb{E}[\Sigma^{-1}(x-\mu)(x-\mu)^T]) &&\text{trace and expectation commute}\\
&= \mathrm{tr}(\Sigma^{-1}\mathbb{E}[(x-\mu)(x-\mu)^T])\\
&= \mathrm{tr}(\Sigma^{-1}\Sigma) &&\text{$\mathbb{E}[(x-\mu)(x-\mu)^T] = \Sigma$ by definition}\\
&= \mathrm{tr}(I) = D
\end{align}
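Substituting $\mathbb{E}[(x-\mu)^T\Sigma^{-1}(x-\mu)] = D$ back into the expression for $H[x]$ derived in the question then yields the stated result:

\begin{align}
H[x] = \frac{D}{2}\ln(2\pi) + \frac12\ln|\Sigma| + \frac{D}{2} = \frac12\ln|\Sigma| + \frac{D}{2}(1 + \ln(2\pi)).
\end{align}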