Entropy of a perturbed or weighted binomial random variable

entropy · information-theory · probability · random-variables

Let $W$ be a Binomial random variable taking values $i=0,1,\dots,n$ with $P(W=i)=\binom{n}{i} p^{i} (1-p)^{n-i}=p_i$.

As per Wikipedia, the entropy of $W$ is given by $H(W)=H(\{p_i\})=\frac{1}{2} \log _{2}(2 \pi e n p (1-p))+O\left(\frac{1}{n}\right).$
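As a quick sanity check on that asymptotic formula, one can compare it against the exact entropy computed by brute force (a numerical sketch; the values $n=100$, $p=0.3$ are arbitrary choices, not from the question):

```python
from math import comb, log2, pi, e

def binomial_entropy_bits(n, p):
    """Exact entropy (in bits) of a Binomial(n, p) random variable."""
    total = 0.0
    for i in range(n + 1):
        pi_ = comb(n, i) * p**i * (1 - p)**(n - i)
        if pi_ > 0:
            total -= pi_ * log2(pi_)
    return total

n, p = 100, 0.3  # arbitrary test values
exact = binomial_entropy_bits(n, p)
approx = 0.5 * log2(2 * pi * e * n * p * (1 - p))
# exact and approx should agree up to the O(1/n) term
```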

Now, consider $q_i=e^{-a/(bi+c)}$, where $a,b,c$ are constants $\geq 0$.

How can I compute (at least approximately) the entropy $H\left(\frac{\left\{p_{i} q_{i}\right\}}{d}\right)$, where $d=\sum p_iq_i$ is the normalizing factor?

The context of this question comes from this problem. My aim is to compute
$G=d H\left(\frac{\left\{p_{i} q_{i}\right\}}{d}\right)+(1-d) H\left(\frac{\left\{p_{i}\left(1-q_{i}\right)\right\}}{1-d}\right)$ eventually.
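Lacking a closed form, $G$ can at least be evaluated numerically by brute force. A minimal sketch (the function name and the parameter values in the usage line are my own; I take the support as $i=0,\dots,n$ and require $c>0$ so that $q_i$ is defined at $i=0$):

```python
from math import comb, exp, log

def G(n, p, a, b, c):
    """d*H({p_i q_i}/d) + (1-d)*H({p_i(1-q_i)}/(1-d)), entropy in nats."""
    ps = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    qs = [exp(-a / (b * i + c)) for i in range(n + 1)]
    d = sum(pi * qi for pi, qi in zip(ps, qs))

    def H(ws):  # entropy of an already-normalized distribution
        return -sum(w * log(w) for w in ws if w > 0)

    h1 = H([pi * qi / d for pi, qi in zip(ps, qs)])
    h2 = H([pi * (1 - qi) / (1 - d) for pi, qi in zip(ps, qs)])
    return d * h1 + (1 - d) * h2

# example call with arbitrary parameters: G(50, 0.4, 1.0, 0.5, 1.0)
```

Sweeping $b$ or $c$ in a loop over such calls also gives a cheap empirical check of the monotonicity subquestion below.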

Subquestion: I am also interested in knowing whether $G$ is increasing in $c$ and decreasing in $b$.

Best Answer

Let $D$ be the entropy difference:$$ D = H(\{p_i\}) -G = H(\{p_i\}) - r H\left(\frac{\left\{p_{i} q_{i}\right\}}{r}\right)-(1-r) H\left(\frac{\left\{p_{i}\left(1-q_{i}\right)\right\}}{1-r}\right) \tag 1$$

(I use $r=\sum q_i p_i$ instead of $d$ for readability.)

Using natural logarithms (entropy in nats), letting $\delta_i = p'_i-p_i=\frac{q_i p_i}{r}-p_i$, and doing a second-order Taylor expansion around $\delta_i=0$ (the first-order term vanishes), I get:

$$D \approx \frac{1}{ 2r(1-r)} \sum_i p_i (r - q_i)^2=\frac{\sum_i p_i q_i^2 - r^2 }{ 2r(1-r)} \tag 2$$

This approximation should be good for small $\delta_i$, that is, for $q_i$ nearly constant.
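For completeness, here is the expansion step spelled out (standard second-order expansion of the entropy; nothing beyond the definitions above is assumed). Writing $\delta'_i = \frac{p_i(q_i-r)}{r}$ and $\delta''_i = \frac{p_i(r-q_i)}{1-r}$ for the shifts of the two conditional distributions, the expansion
$$H(\{p_i+\delta_i\}) \approx H(\{p_i\}) - \sum_i \delta_i(1+\ln p_i) - \sum_i \frac{\delta_i^2}{2p_i}$$
together with $r\delta'_i+(1-r)\delta''_i=0$ (so the first-order terms cancel in the combination $r H(\cdot)+(1-r)H(\cdot)$) gives
$$D \approx \sum_i \frac{r\,{\delta'_i}^2+(1-r)\,{\delta''_i}^2}{2p_i} = \frac{1}{2}\sum_i p_i(q_i-r)^2\left(\frac{1}{r}+\frac{1}{1-r}\right) = \frac{\sum_i p_i(r-q_i)^2}{2r(1-r)},$$
which is $(2)$.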

For example: for $p = (0.1, 0.2, 0.3, 0.25, 0.15)$ and $q=(0.3, 0.32, 0.34, 0.36, 0.38)$ I get $D= 0.0012702$ while the approximation gives $0.0012669$.
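That numerical check is easy to reproduce (a sketch; the variable names are mine):

```python
from math import log

p = [0.1, 0.2, 0.3, 0.25, 0.15]
q = [0.3, 0.32, 0.34, 0.36, 0.38]
r = sum(pi * qi for pi, qi in zip(p, q))

def H(ws):  # entropy in nats of a normalized distribution
    return -sum(w * log(w) for w in ws if w > 0)

g = r * H([pi * qi / r for pi, qi in zip(p, q)]) \
    + (1 - r) * H([pi * (1 - qi) / (1 - r) for pi, qi in zip(p, q)])
D_exact = H(p) - g
D_approx = (sum(pi * qi**2 for pi, qi in zip(p, q)) - r**2) / (2 * r * (1 - r))
# D_exact ≈ 0.0012702, D_approx ≈ 0.0012669, as quoted above
```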

The approximation can also be written as

$$D \approx \frac{\sigma^2}{2 \mu (1-\mu)} $$ where $\mu, \sigma^2$ are the mean and variance of a random variable taking values $q_i$ with probabilities $p_i$. This form can also be obtained by Taylor-expanding the binary entropy function around $\mu$ in the first equation of the other answer.
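On the same example data as above, the $\mu,\sigma^2$ form gives the identical number, since $\mu = r$ and $\sigma^2 = \sum_i p_i q_i^2 - r^2$ exactly:

```python
# mean and variance of a r.v. taking values q_i with probabilities p_i
p = [0.1, 0.2, 0.3, 0.25, 0.15]
q = [0.3, 0.32, 0.34, 0.36, 0.38]
mu = sum(pi * qi for pi, qi in zip(p, q))
var = sum(pi * qi**2 for pi, qi in zip(p, q)) - mu**2
D_approx = var / (2 * mu * (1 - mu))
# D_approx ≈ 0.0012669, matching the approximation in the example above
```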