[Math] Entropy of the product of two random variables

entropy, inequality, probability

Consider a random matrix $X$ and a random vector $Y$, and let the Shannon entropies be $H(X) = H(Y) = n$. Is there a simple upper bound for the entropy $H(XY)$? I believe $H(XY) \leq 2n$, since that is a simple upper bound for the joint entropy of $(X,Y)$, but is it also the case that $H(XY) \leq n$?


The entries of $X$ need not be independent. $X$ is $m$ by $n$ for some $m \leq n$ and $Y$ has $n$ entries.

Best Answer

Differential entropy

I don't think you can derive such an upper bound, for the following reason.

Let's assume $Y \sim \mathcal N(0,\eta I)$ is Gaussian (with $\eta$ chosen such that $H[Y]=n$) and $X \sim \mathcal N(\lambda O, \kappa I)$, where $\kappa$ is chosen such that $H[X]=n$, $O$ is the all-ones matrix, and $\lambda$ is a free parameter that we are going to vary.
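
For reference, the closed form of the Gaussian differential entropy (a standard fact, not spelled out in the answer) shows both how $\eta$ and $\kappa$ would be chosen and why the mean plays no role: for a $d$-dimensional Gaussian with covariance $\sigma^2 I_d$, $$H\bigl[\mathcal N(\mu,\sigma^{2}I_{d})\bigr]=\frac{d}{2}\log\bigl(2\pi e\,\sigma^{2}\bigr),$$ which is independent of $\mu$; setting this equal to $n$ (with $d$ the number of entries of $Y$, respectively of $X$) fixes $\eta$ and $\kappa$.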

For $\lambda=1$ the entropy will take some value $H[XY]=c$. However, since the differential entropy depends on scale, increasing $\lambda$ will also increase $H[XY]$. More specifically, for arbitrary $\lambda$, $$H[XY]=H\left[\frac{1}{\lambda} XY\right]+n\log \lambda=c + n\log \lambda.$$

Since $\lambda$ only changes the mean of $X$, it does not affect the marginal entropy $H[X]$. However, as you can see, the entropy $H[XY]$ can be made arbitrarily large by increasing $\lambda$. Therefore, there is no upper bound in terms of the marginal entropies alone.
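
A quick numerical illustration of this point (my own sketch, using a 1-D analogue and SciPy's nonparametric entropy estimator; none of the names below come from the original question or answer):

```python
# 1-D Monte Carlo illustration (my own sketch, not part of the answer):
# X ~ N(lambda, 1) and Y ~ N(0, 1) independent, so H[X] and H[Y] do not
# depend on lambda, yet the estimated differential entropy of the product
# X*Y keeps growing with lambda -- hence no bound in terms of H[X], H[Y].
import numpy as np
from scipy.stats import differential_entropy

rng = np.random.default_rng(0)
n_samples = 200_000

for lam in (1, 10, 100, 1000):
    x = rng.normal(loc=lam, scale=1.0, size=n_samples)
    y = rng.normal(loc=0.0, scale=1.0, size=n_samples)
    h_xy = differential_entropy(x * y)  # nonparametric estimate, in nats
    print(f"lambda = {lam:5d}   estimated H[XY] ~ {h_xy:.2f} nats")
```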

It might help to know the transformation formula for differential entropies. Maybe this makes it clearer why the differential entropy changes under a deterministic (invertible) function $f$.

Consider an invertible function $f:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$, and suppose we want to express $H\left[Y\right]$ in terms of $H\left[X\right]$, where $Y=f(X)$:

\begin{align}H\left[Y\right] &= \left\langle -\log p_{Y}(Y)\right\rangle _{Y}\\ &= \left\langle -\log p_{Y}(f(X))\right\rangle _{X}\\ &= \left\langle -\log\left(\underbrace{p_{Y}(f(X))\left|\frac{dy}{dx}\right|}_{p_{X}(X)}\left|\frac{dy}{dx}\right|^{-1}\right)\right\rangle _{X}\\ &= H\left[X\right]+\left\langle \log\left(\left|\frac{dy}{dx}\right|\right)\right\rangle _{X}. \end{align}
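
As a quick special case (my own addition, not in the original answer): for an invertible linear map $f(x)=Ax$ the Jacobian is constant, so the formula reduces to $$H[AX]=H[X]+\log\left|\det A\right|,$$ and $A=\lambda I_n$ gives $\log\left|\det(\lambda I_n)\right|=n\log\lambda$, which is exactly the scaling term used in the argument above.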

Discrete entropy

(After discussion with Will Nelson in the comments.) Will was right that for discrete entropies and a deterministic function $f$, one can show that $$H[XY]=H[f(X,Y)]\le H[X] + H[Y].$$ The reason (which can be found on Wikipedia) is the chain rule $$H[Z,f(Z)]=H[Z] + H[f(Z)|Z] = H[Z|f(Z)] + H[f(Z)],$$ together with $H[f(Z)|Z]=0$ for deterministic $f$ and $H[Z|f(Z)]\ge 0$, which give $H[f(Z)]\le H[Z]$; applied to $Z=(X,Y)$ this yields $H[f(X,Y)]\le H[X,Y]\le H[X]+H[Y]$.
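
A small numerical check of this inequality (my own sketch; the distributions below are arbitrary illustrative choices, not taken from the question):

```python
# Small numerical check (my own sketch, not from the answer): for discrete
# X and Y, the entropy of the deterministic product X*Y never exceeds
# H[X] + H[Y].
import itertools
from collections import Counter
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Independent scalar X and Y with a handful of outcomes each.
pX = {1: 0.5, 2: 0.25, 3: 0.25}   # H[X] = 1.5 bits
pY = {0: 0.5, 1: 0.5}             # H[Y] = 1.0 bit

# Push-forward distribution of f(X, Y) = X * Y.
pXY = Counter()
for (x, px), (y, py) in itertools.product(pX.items(), pY.items()):
    pXY[x * y] += px * py

print("H[X] + H[Y] =", entropy(pX) + entropy(pY))   # 2.5
print("H[XY]       =", entropy(dict(pXY)))          # 1.75 <= 2.5
```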

How can we apply this to the case above? First of all, we know that $$2n \ge H[X,Y] \ge n.$$ This follows from $H[X,Y] = H[X] + H[Y|X]$, $H[Y|X]\ge 0$ and the fact that conditioning reduces entropy.
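
Written out in one chain (just expanding the sentence above): $$n=H[X]\;\le\;\underbrace{H[X]+H[Y|X]}_{=\,H[X,Y]}\;\le\;H[X]+H[Y]\;=\;2n,$$ where the first inequality uses $H[Y|X]\ge 0$ and the second uses the fact that conditioning reduces entropy, $H[Y|X]\le H[Y]$.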

So if $X$ and $Y$ are completely dependent, then $H[X,Y] = n \ge H[XY]$. Whether $H[XY]=H[X,Y]=n$ depends on whether $H[X,Y|XY]=0$, i.e. whether $(X,Y)$ can be recovered from the product $XY$; roughly speaking, this requires the nullspace of $X$ to be trivial for every realization of $X$.
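
A toy $1$-bit illustration of this condition (my own example, not from the answer), using $1\times 1$ matrices: if $X$ is uniform on $\{1,2\}$ and $Y=X$, then $XY=X^{2}$ determines $(X,Y)$, so $H[X,Y|XY]=0$ and $H[XY]=H[X,Y]=1$. If instead $X$ is uniform on $\{-1,1\}$ and $Y=X$, then $XY=1$ is constant, $H[X,Y|XY]=1$, and $H[XY]=0<H[X,Y]=1$.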

So we now know that $H[XY]\le H[X,Y]$, and under which conditions this bound equals $n$. Of course one could still have $H[X,Y]>n$ and $H[XY]\le n$, but without further knowledge of the distribution $p(X,Y)$ I see no way to show anything better.
