Proving that the conditional entropy of a probability measure is concave

ergodic-theory, measure-theory, probability-theory

Let $\mu$ be a probability measure on $\mathcal{X}$ and let $\mathcal{E}, \mathcal{F}$ be countable partitions of the space. Define the entropy of $\mu$ with respect to the partition $\mathcal{E}$ as
$$
H(\mu, \mathcal{E}) = - \sum_{E \in \mathcal{E}} \mu(E) \log \mu(E)
$$

and the conditional entropy as
$$
H(\mu, \mathcal{E} | \mathcal{F}) = \sum_{F \in \mathcal{F}} \mu(F) H(\mu_F, \mathcal{E}) = - \sum_{F \in \mathcal{F}} \sum_{E \in \mathcal{E}} \mu_{|F}(E) \log \left(\dfrac{1}{\mu(F)} \mu_{|F}(E)\right),
$$

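where $\mu_{|F}(\cdot) = \mu(\cdot \cap F)$ is the restriction of $\mu$ to $F \in \mathcal{F}$ and $\mu_F = \mu_{|F} / \mu(F)$ is the normalized restriction, defined whenever $\mu(F) > 0$. Terms with $\mu(F) = 0$ are read as zero.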
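To make the two expressions for the conditional entropy concrete, here is a minimal sketch for finite partitions. It assumes the measure is given as a table `joint[j][i]` standing for $\mu(E_i \cap F_j)$; that representation and all function names are illustrative choices, not part of the question.

```python
import math

def entropy(masses):
    """H = -sum p log p for a list of probabilities, with 0 log 0 read as 0."""
    return -sum(p * math.log(p) for p in masses if p > 0)

def conditional_entropy(joint):
    """First form: sum over F of mu(F) * H(mu_F, E).

    joint[j][i] is assumed to hold mu(E_i ∩ F_j).
    """
    total = 0.0
    for row in joint:
        mass_F = sum(row)                      # mu(F_j)
        if mass_F > 0:                         # cells of mass zero contribute nothing
            mu_F = [m / mass_F for m in row]   # normalized restriction mu_F
            total += mass_F * entropy(mu_F)
    return total

def conditional_entropy_direct(joint):
    """Second form: -sum over F, E of mu|F(E) * log(mu|F(E) / mu(F))."""
    total = 0.0
    for row in joint:
        mass_F = sum(row)
        for m in row:
            if m > 0:
                total -= m * math.log(m / mass_F)
    return total
```

Both functions should agree up to rounding on any table of nonnegative masses summing to one. When $\mathcal{E}$ and $\mathcal{F}$ are independent under $\mu$, they reduce to the plain entropy $H(\mu, \mathcal{E})$.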

Now it is easy to see by concavity of $x \mapsto -x \log x$ that the entropy $\mu \mapsto H(\mu, \mathcal{E})$ is a concave function, but what about the conditional entropy? How can I prove its concavity? The normalizing coefficient $\dfrac{1}{\mu(F)}$ seems to make the function slightly more complicated.

Best Answer

Note that if we define
$$
\mu_F(X)=\frac{\mu(X\cap F)}{\mu(F)}
$$
for measurable $X \subseteq \mathcal{X}$ and $F\in \mathcal{F}$ with $\mu(F)>0$, then $\mu_F$ is also a probability measure. Let $\omega =c\mu +(1-c)\nu$, where $\mu,\nu$ are probability measures and $c\in (0,1)$, so that $\omega$ is again a probability measure. For $\omega(F)>0$ we have
$$
\begin{aligned}
\omega_F (X)&=\frac{c\mu(X\cap F)+(1-c)\nu(X\cap F)}{c\mu(F)+(1-c)\nu(F)}\\
&=\frac{c\mu(F)}{c\mu(F)+(1-c)\nu(F)}\,\mu_F(X)+\frac{(1-c)\nu(F)}{c\mu(F)+(1-c)\nu(F)}\,\nu_F(X),
\end{aligned}
$$
i.e. $\omega_F$ is a convex combination of $\mu_F$ and $\nu_F$. (If $\mu(F)=0$ or $\nu(F)=0$, the corresponding term has weight zero and is simply dropped.) By concavity of the entropy $\mu \mapsto H(\mu, \mathcal{E})$, it follows that
$$
\begin{aligned}
H(\omega_F, \mathcal{E})&\ge \frac{c\mu(F)}{c\mu(F)+(1-c)\nu(F)}H(\mu_F, \mathcal{E})+\frac{(1-c)\nu(F)}{c\mu(F)+(1-c)\nu(F)}H(\nu_F, \mathcal{E})\\
&=\frac{c\mu(F)}{\omega(F)}H(\mu_F, \mathcal{E})+\frac{(1-c)\nu(F)}{\omega(F)}H(\nu_F, \mathcal{E})
\end{aligned}
$$
for $\omega(F)>0$. Multiplying by $\omega(F)$ and summing over $F \in \mathcal{F}$ (cells with $\omega(F)=0$ have $\mu(F)=\nu(F)=0$ and contribute zero to every sum) gives
$$
\begin{aligned}
H(\omega, \mathcal{E} | \mathcal{F}) &= \sum_{F \in \mathcal{F}} \omega(F) H(\omega_F, \mathcal{E}) \\
&\ge\sum_{F \in \mathcal{F}}\Big(c\mu(F)H(\mu_F, \mathcal{E})+(1-c)\nu(F)H(\nu_F, \mathcal{E})\Big)\\
&=cH(\mu, \mathcal{E} | \mathcal{F})+(1-c)H(\nu, \mathcal{E} | \mathcal{F}),
\end{aligned}
$$
as desired.
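As a sanity check rather than a proof, the inequality can be tested numerically on random measures over finite partitions. The sketch below assumes a measure is stored as a table `joint[j][i]` standing for $\mu(E_i \cap F_j)$; the helper names, the partition sizes (3 and 4) and the number of trials are arbitrary illustrative choices.

```python
import math
import random

def conditional_entropy(joint):
    """H(mu, E | F) = -sum over F, E of mu(E ∩ F) * log(mu(E ∩ F) / mu(F))."""
    total = 0.0
    for row in joint:
        mass_F = sum(row)                      # mu(F_j)
        for m in row:
            if m > 0:                          # 0 log 0 is read as 0
                total -= m * math.log(m / mass_F)
    return total

def random_joint(rng, n_F, n_E):
    """Random probability table with n_F rows (cells of F) and n_E columns (cells of E)."""
    weights = [[rng.random() for _ in range(n_E)] for _ in range(n_F)]
    norm = sum(map(sum, weights))
    return [[w / norm for w in row] for row in weights]

def mix(c, mu, nu):
    """The convex combination omega = c*mu + (1-c)*nu, taken entrywise."""
    return [[c * a + (1 - c) * b for a, b in zip(r_mu, r_nu)]
            for r_mu, r_nu in zip(mu, nu)]

def concavity_gap(c, mu, nu):
    """H(omega | F) - (c H(mu | F) + (1-c) H(nu | F)); concavity says this is >= 0."""
    mixed = conditional_entropy(mix(c, mu, nu))
    return mixed - (c * conditional_entropy(mu) + (1 - c) * conditional_entropy(nu))

def smallest_gap(trials=1000, seed=0):
    """Smallest concavity gap observed over random (mu, nu, c) triples."""
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(trials):
        mu = random_joint(rng, 3, 4)
        nu = random_joint(rng, 3, 4)
        worst = min(worst, concavity_gap(rng.random(), mu, nu))
    return worst
```

If the argument above is right, `smallest_gap()` should never be negative beyond floating-point rounding, and `concavity_gap(c, mu, mu)` should vanish up to rounding for any `c`.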
