What does a "reference measure" mean in the context of defining Shannon entropy?

entropy, information-theory, probability

I have come across the following definition of Shannon entropy:

The Shannon entropy of a random variable $X$ with distribution $\mu$, with respect to a reference measure $\rho$, is $$H_\rho[X] := -\mathbb{E}_\mu\left[\log \frac{d\mu}{d\rho}\right]$$ whenever $\mu \ll \rho$.

What does a reference measure mean here? How does this compare to the discrete definition $H[X] = -\sum_{x} P[X = x] \log P[X = x]$?

Best Answer

You mean $\mu\ll\rho$ (I made the correction).

For example, if $f$ is the density of an absolutely continuous random variable $X$, then $\rho$ can be taken to be the Lebesgue measure, so that $$ f=\frac{d\mu}{d\rho} $$ and $$ \begin{split} H_\rho[X] &= - \mathbb{E}_\mu\left[\log \frac{d\mu}{d\rho}\right]\\ &=-\int\log \frac{d\mu}{d\rho}\,d\mu\\ &=-\int \frac{d\mu}{d\rho}\log \frac{d\mu}{d\rho}\,d\rho\\ &=-\int f(x)\log f(x) \, dx, \end{split} $$ where the third equality uses the change-of-measure identity $\int g\,d\mu = \int g\,\frac{d\mu}{d\rho}\,d\rho$. This last expression is the differential entropy of $X$.
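To make the continuous case concrete, here is a minimal numerical sketch (assuming NumPy and SciPy are available; the grid and its bounds are arbitrary choices for illustration). It approximates $-\int f(x)\log f(x)\,dx$ for a standard normal density and compares the result with the known closed form $\frac{1}{2}\log(2\pi e)$.

```python
import numpy as np
from scipy.stats import norm

# Density of X ~ N(0, 1); here f = dmu/drho with rho the Lebesgue measure.
x = np.linspace(-10.0, 10.0, 200_001)
f = norm.pdf(x)

# H_rho[X] = -int f(x) log f(x) dx, approximated by the trapezoidal rule.
# (f > 0 everywhere for the normal, so log(f) is well defined on the grid.)
h_numeric = -np.trapz(f * np.log(f), x)

# Closed form for the differential entropy of N(0, 1): (1/2) log(2*pi*e).
h_exact = 0.5 * np.log(2 * np.pi * np.e)

print(h_numeric, h_exact)  # both approximately 1.41894 (in nats)
```

Note that, unlike discrete entropy, this quantity can be negative (e.g. for a normal with very small variance), which is one way the choice of reference measure matters.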

As you can see, this closely resembles the formula $$ H[X] = -\sum_{x} P[X = x] \log P[X = x], $$ which can in fact be written in the form $H_\rho[X]$ by taking $\rho$ to be the usual counting measure (possibly an infinite measure, but that is fine for this purpose); in that case $\frac{d\mu}{d\rho}(x) = P[X = x]$.
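Along the same lines, a small sketch for the discrete case (again assuming NumPy; the particular PMF is made up for illustration): with $\rho$ the counting measure, the Radon-Nikodym derivative is just the probability mass function, so $H_\rho[X]$ reduces to the familiar sum.

```python
import numpy as np

# PMF of a discrete X; with rho the counting measure, dmu/drho(x) = P[X = x].
p = np.array([0.5, 0.25, 0.125, 0.125])

# H_rho[X] = -E_mu[log dmu/drho] = -sum_x P[X = x] log P[X = x].
# (Any terms with P[X = x] = 0 would be dropped, using 0 log 0 := 0.)
h = -np.sum(p * np.log(p))

print(h)              # about 1.2130 nats
print(h / np.log(2))  # 1.75 bits, since the logarithm base is a free choice
```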