[Math] Proving a messy inequality

Tags: entropy, inequalities, it.information-theory

EDIT:
After much work I was able to reduce the inequality to a single variable function which I need to show is non-positive. That function is (for $0\leq p\leq\frac{1}{2}$)
$$\frac{p^2(\log(p))^2 - (1-p)^2(\log(1-p))^2}{1-2p} + \log(p)\log(1-p) + p\log(p)+(1-p)\log(1-p) \leq 0$$
This looks tractable, but is quite hard to prove since neither the function nor any of its derivatives are convex/concave/monotonic. Its graph looks like

(plot omitted)

Any inputs welcome!
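Not a proof, but the claim survives a brute-force check. Here is a quick numerical sketch (natural log assumed; `F` is my name for the left-hand side), evaluated strictly inside $(0,\frac{1}{2})$ to avoid the removable singularities at the endpoints:

```python
import math

def F(p):
    """The single-variable expression above (natural log)."""
    lp, lq = math.log(p), math.log(1 - p)
    return ((p**2 * lp**2 - (1 - p)**2 * lq**2) / (1 - 2 * p)
            + lp * lq + p * lp + (1 - p) * lq)

# Evaluate on a fine grid strictly inside (0, 1/2); the expression
# tends to 0 at both endpoints.
worst = max(F(i / 10000) for i in range(1, 5000))
print(worst)  # negative: the expression stays below 0 on the whole grid
```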

END EDIT

I'm trying to prove this messy inequality I got while working on an information theoretic problem (more details below).

Let $0 \leq p,q \leq \frac{1}{2}$.

Let $H(x) = -x\log(x) - (1-x)\log(1-x)$ be the binary entropy function.

Let $p\star q = p(1-q) + q(1-p)$ (think of this as the addition of two Bernoulli random variables in $\mathbb{Z}_2$).

The inequality is:
$$-\frac{(1-2q)}{(p \star q) (1- p \star q)\log\left(\frac{1-p}{p}\right)} + \frac{\log\left(\frac{1- p \star q}{p \star q}\right)}{p(1-p)\left(\log\left(\frac{1-p}{p}\right)\right)^2} - \frac{H(q)(1-2p)}{H(p)(p \star q) (1- p \star q)\log\left(\frac{1-q}{q}\right)} \leq 0$$

Some Remarks:

Write this as $-A+B-C \leq 0$, with $A,B,C$ being the corresponding quantities above. I can show that $-A+B \geq 0$, so it is the $-C$ term that subtracts enough to make the expression negative. The plot of $-A+B-C$ as a function of $p,q$ looks like

(plot omitted)

It appears to decrease in $q$ for a fixed $p$, starting at $0$ and going to $-\infty$ as $q$ goes to $\frac{1}{2}$.
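Both the inequality and the remark about $-A+B \geq 0$ hold on a fine grid numerically. A small sketch (helper names `terms`, `star` are mine; natural logs assumed):

```python
import math

def H(x):
    """Binary entropy, natural log."""
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def star(p, q):
    return p * (1 - q) + q * (1 - p)

def terms(p, q):
    """The three quantities A, B, C from the inequality above."""
    s = star(p, q)
    L = math.log((1 - p) / p)
    A = (1 - 2 * q) / (s * (1 - s) * L)
    B = math.log((1 - s) / s) / (p * (1 - p) * L**2)
    C = H(q) * (1 - 2 * p) / (H(p) * s * (1 - s) * math.log((1 - q) / q))
    return A, B, C

violations = 0
for i in range(1, 50):
    for j in range(1, 50):
        A, B, C = terms(i / 100, j / 100)
        if -A + B < 0 or -A + B - C > 0:
            violations += 1
print(violations)  # expect 0: no grid point violates either claim
```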

Origin:
This inequality came from looking at the function $f(x,y) = H(H^{-1}(x)\star H^{-1}(y))$, where $f: [0,1]\times [0,1] \to [0,1]$. This function occasionally pops up in information theory, for instance when studying the binary symmetric channel or in an inequality called Mrs. Gerber's Lemma. The inequality I have implies that $f(x,y)$ is concave along lines through the origin (i.e. $f(x,\theta x)$ is concave in $x$). What I've written above is a simplified version of $f''$ along a line through the origin.
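For what it's worth, the claimed concavity of $x \mapsto f(x,\theta x)$ can be probed numerically via second differences. A sketch (all names mine), taking $H$ base-$2$ so that $f$ maps $[0,1]^2 \to [0,1]$, and inverting $H$ by bisection:

```python
import math

def H2(x):
    """Binary entropy in bits, restricted to [0, 1/2] -> [0, 1]."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def H2inv(y, tol=1e-13):
    """Inverse of H2 on [0, 1/2], by bisection (H2 is increasing there)."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if H2(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def star(p, q):
    return p * (1 - q) + q * (1 - p)

def f(x, y):
    return H2(star(H2inv(x), H2inv(y)))

# Concavity along rays: second differences of x -> f(x, theta*x)
# should be <= 0 (up to numerical noise).
h = 1e-3
for theta in (0.3, 0.6, 1.0):
    for i in range(2, 99):
        x = i / 100
        d2 = (f(x + h, theta * (x + h)) - 2 * f(x, theta * x)
              + f(x - h, theta * (x - h)))
        assert d2 <= 1e-9, (theta, x, d2)
```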

I'm at a complete loss as to how to attack this. All the plots I have show this is true, but a plot is not a proof. Any suggestions welcome!

Best Answer

I think I managed to prove the entire inequality analytically. The whole proof is a bit long to post here (about 7 pages) and involves ugly looking expressions. I'll outline the general strategy I used:

  1. Start with the original inequality $$-\frac{(1-2q)}{(p \star q) (1- p \star q)\log\left(\frac{1-p}{p}\right)} + \frac{\log\left(\frac{1- p \star q}{p \star q}\right)}{p(1-p)\left(\log\left(\frac{1-p}{p}\right)\right)^2} - \frac{H(q)(1-2p)}{H(p)(p \star q) (1- p \star q)\log\left(\frac{1-q}{q}\right)} \leq 0$$ and throw all the $p\star q$ terms on one side to get $$(p \star q)(1- p\star q)\log\left(\frac{1-p \star q}{p \star q}\right) \leq(1-2q)p(1-p)\log\left(\frac{1-p}{p}\right) + \frac{p(1-p)(1-2p)\left(\log\left(\frac{1-p}{p}\right)\right)^2 H(q)}{H(p)\log\left(\frac{1-q}{q}\right)}$$
  2. Now keep $p \star q$ fixed, $= k$ say, and try to minimise the right hand side. Notice that when $p = k$, $q = 0$, we have equality. This might lead us to hope that the right hand side is a decreasing function of $p$, with equality attained when $p$ reaches $k$.

  3. Now consider the partial derivative of the RHS in $p$, and try to show that this is negative. By some stroke of luck, it turns out that if we look at $\frac{\partial RHS}{\partial p}$ for a fixed $p$ and vary $k$, the value of $k$ which maximises $\frac{\partial RHS}{\partial p}$ is $k = p$. So it suffices to prove $\frac{\partial RHS}{\partial p}\big|_{(p,p)} \leq 0$.

  4. $\frac{\partial RHS}{\partial p}|_{(p,p)}$ when evaluated gives me the second inequality (the one I added in the EDIT). Thus it suffices to show $$p^2(\log(p))^2 - (1-p)^2(\log(1-p))^2 + (1-2p)\left(\log(p)\log(1-p) + p\log(p)+(1-p)\log(1-p)\right) \leq 0$$

  5. Let me call the above expression $F(p)$. I showed that if $F(p) > 0$ somewhere, then $F'''(p)$ would have at least $2$ zeros in $[0,0.5]$. (This is easy to show.) Explicitly computing $F'''(p)$ shows that it is $-\infty$ at $0$ and strictly positive at $0.5$. So, if we could show that $F'''$ is concave, it would have exactly one zero, contradicting the fact that it should have $2$, and thus proving the result.

  6. Now I do something truly horrible. I explicitly compute the 5-th derivative and show that it is negative, thus $F'''$ is indeed concave.

  7. The inequality $F'''''(p) \leq 0$ can be somewhat simplified to get $$P_1(p)\log(p) + P_2(p)\log(1-p) + P_3(p) \leq 0$$ where $P_1, P_2,P_3$ are some $7$-th degree polynomials.

  8. I now use Sturm's theorem to show that $P_1, P_2 \geq 0$ over the range $[0,0.5]$. This, too, was quite horrible.

  9. Now I use a polynomial approximation to $\log$, and use the positivity of $P_1,P_2$ to conclude $$P_1(p)\log(p) + P_2(p)\log(1-p) + P_3(p)\leq P_1(p)\left( -(1-p) - \frac{(1-p)^2}{2}\right) + P_2(p)\left(-p - \frac{p^2}{2}\right) + P_3(p)$$

  10. By another stroke of luck, the RHS here factorises to give $$ -24(1-p)^2 (p-0.5)^2 p^2 \left(p^2-p+\tfrac{7}{3}\right) $$ which is clearly non-positive, since $p^2-p+\frac{7}{3} = \left(p-\frac{1}{2}\right)^2 + \frac{25}{12} > 0$. Thus by step $9$ I can now say that $F'''''$ is non-positive, so $F'''$ is concave, has only one root in $[0,0.5]$, and thus $F$ has to be non-positive everywhere in $[0,0.5]$.
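For reference, the polynomial bounds on the logarithms used in step 9 come from truncating $\log(1-x) = -\sum_{n\geq 1} x^n/n$; every discarded term is negative, so truncation can only increase the value:
$$\log(p) = -\sum_{n\geq 1}\frac{(1-p)^n}{n} \leq -(1-p)-\frac{(1-p)^2}{2}, \qquad \log(1-p) = -\sum_{n\geq 1}\frac{p^n}{n} \leq -p-\frac{p^2}{2}$$
Multiplying by $P_1, P_2 \geq 0$ (step 8) preserves the direction of both bounds, which is exactly the estimate in step 9.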
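As a sanity check on step 2 (not part of the proof), the claim that the RHS stays above the LHS along slices of fixed $k = p \star q$ is easy to test numerically. A quick sketch, assuming natural logs; `lhs` and `rhs` are my names for the two sides of the rearranged inequality in step 1:

```python
import math

def H(x):
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def lhs(k):
    return k * (1 - k) * math.log((1 - k) / k)

def rhs(p, q):
    L = math.log((1 - p) / p)
    return ((1 - 2 * q) * p * (1 - p) * L
            + p * (1 - p) * (1 - 2 * p) * L**2 * H(q)
              / (H(p) * math.log((1 - q) / q)))

# For fixed k = p star q, we have q = (k - p)/(1 - 2p), with p in (0, k);
# the claim is rhs >= lhs(k), with equality approached as p -> k (q -> 0).
for k in (0.1, 0.25, 0.4, 0.45):
    for i in range(1, 100):
        p_ = k * i / 100
        q_ = (k - p_) / (1 - 2 * p_)
        assert rhs(p_, q_) >= lhs(k), (k, p_)
```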
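The derivative claims in steps 5 and 6 can also be spot-checked symbolically, e.g. with sympy (again only a sanity check; `F` below is the expression from step 4):

```python
import sympy as sp

p = sp.symbols('p', positive=True)
lp, lq = sp.log(p), sp.log(1 - p)
F = (p**2 * lp**2 - (1 - p)**2 * lq**2
     + (1 - 2 * p) * (lp * lq + p * lp + (1 - p) * lq))

F3 = sp.diff(F, p, 3)  # third derivative, step 5
F5 = sp.diff(F, p, 5)  # fifth derivative, step 6

# F''' blows down to -infinity near 0 and is strictly positive at 1/2 ...
assert F3.subs(p, sp.Rational(1, 1000)).evalf() < -1000
assert F3.subs(p, sp.Rational(1, 2)).evalf() > 0
# ... while the fifth derivative stays non-positive on (0, 1/2].
for i in range(1, 11):
    assert F5.subs(p, sp.Rational(i, 20)).evalf() <= 1e-9
```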
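For step 8, I haven't reproduced the actual $P_1, P_2$ here, but sympy can mechanise the Sturm computation. Here is the mechanism on a stand-in degree-$7$ polynomial (an illustration only, not the real $P_1$):

```python
import sympy as sp

x = sp.symbols('x')
# Stand-in degree-7 polynomial (NOT the actual P1 from the proof).
P = 7 * x**7 - 3 * x**5 + 10 * x**2 + 1

chain = sp.sturm(P)  # the Sturm sequence of P

def sign_changes(vals):
    signs = [sp.sign(v) for v in vals if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Sturm's theorem: the number of distinct real roots in (a, b] equals
# V(a) - V(b), the drop in sign changes along the chain.
a, b = sp.Integer(0), sp.Rational(1, 2)
roots_in_interval = (sign_changes([q.subs(x, a) for q in chain])
                     - sign_changes([q.subs(x, b) for q in chain]))
print(roots_in_interval)  # 0: no roots in (0, 1/2]; with P(0) = 1 > 0,
                          # P is therefore positive on all of [0, 1/2]
```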

In retrospect, I guess using numerical techniques (as advised in the comments section) would have been more prudent.
