For a Fixed Variance, Gaussian Distribution Maximizes Entropy

calculus, entropy, gaussian-integral, lagrange-multiplier

I was reading this paper. On page 5, second column, the authors mention that

$$h(Q) + h(P) \ge \log(e\pi) \implies \sigma(Q)\,\sigma(P) \ge \frac{1}{2}$$

where the entropy $h$ is defined in the following way:
$$h(Q) = -\int_{-\infty}^{\infty} \Gamma(q)\, \log\big(\Gamma(q)\big)\, dq.$$

It was then mentioned that, for a fixed variance, the entropy $h(Q)$ is maximized when $\Gamma(q)$ follows a Gaussian distribution, and that this can be shown by variational calculus with Lagrange multipliers. Could anyone please give a hint on how to proceed with the proof?
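For context, here is how I understand the claim fits together (assuming $\log$ is the natural logarithm and using the standard formula $h = \frac12 \log(2\pi e \sigma^2)$ for the entropy of a Gaussian with variance $\sigma^2$): if the Gaussian maximizes entropy for a fixed variance, then
$$ \log(e\pi) \le h(Q) + h(P) \le \tfrac12\log\!\big(2\pi e\,\sigma(Q)^2\big) + \tfrac12\log\!\big(2\pi e\,\sigma(P)^2\big) = \log\!\big(2\pi e\,\sigma(Q)\,\sigma(P)\big), $$
and exponentiating gives $e\pi \le 2\pi e\,\sigma(Q)\,\sigma(P)$, i.e. $\sigma(Q)\,\sigma(P) \ge \frac12$.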

Best Answer

First, some properties of $\Gamma(q)$. It is a probability distribution, so $$ \int_{-\infty}^\infty \Gamma(q) \,dq = 1 $$

We can assume without loss of generality that the variable $q$ has been shifted so that the mean is zero:
$$ \int_{-\infty}^\infty q\, \Gamma(q)\, dq = 0 $$
We can also rescale $q$ linearly so that the "fixed variance" is one:
$$ \int_{-\infty}^\infty q^2\, \Gamma(q)\, dq = 1 $$

Subject to those three constraints, we wish to find an extremum of
$$\int_{-\infty}^\infty \Gamma(q)\, \log(\Gamma(q))\,dq $$
(an extremum of this integral is an extremum of $h(Q)$, which is just its negative).

We rewrite the fixed-variance constraint in such a way that we can ignore the normalization constraint until the end of the problem and then normalize the resulting function. Namely,
$$ \int_{-\infty}^\infty q^2\, \Gamma(q)\, dq = \int_{-\infty}^\infty \Gamma(q) \,dq $$

So we have Lagrange multipliers $\lambda_1, \lambda_2$ associated with the mean and the variance, and our Euler equation is
$$ \frac{\partial}{\partial\Gamma} \big( \Gamma(q)\, \log(\Gamma(q)) \big)+\lambda_1 \frac{\partial}{\partial\Gamma}\big( q\, \Gamma(q)\big)+\lambda_2 \frac{\partial}{\partial\Gamma}\big( q^2\, \Gamma(q) -\Gamma(q)\big) =0 $$
or
$$ 1+\log(\Gamma(q)) + \lambda_1 q + \lambda_2 (q^2-1) = 0, $$
which says that any extremum must be of the form
$$ \Gamma(q) = e^{-\lambda_2 q^2 - \lambda_1 q + (\lambda_2 - 1)}. $$

Now we have to choose $\lambda_1$ and $\lambda_2$ so that the mean and the variance come out right: a zero mean forces $\lambda_1 = 0$, and matching the exponent of a Gaussian $e^{-q^2/(2\sigma^2)}$ gives $\lambda_2 = \frac{1}{2\sigma^2}$, i.e. $\lambda_2 = \frac12$ for our unit variance. Finally we normalize the function, because we punted on the constraint that the probability integrates to $1$. Since $t \mapsto t\log t$ is convex, the integral we extremized is a convex functional of $\Gamma$ on the affine constraint set, so this stationary point minimizes it; equivalently, it maximizes $h(Q)$, and the Gaussian is indeed the maximizer.
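Not part of the variational argument, but here is a quick numerical sanity check in Python (a minimal sketch; the comparison densities and the quadrature grid are my own choices). It estimates $h = -\int \Gamma(q)\log\Gamma(q)\,dq$ by a Riemann sum for three unit-variance densities, and the Gaussian comes out largest, as the argument above predicts:

```python
import numpy as np

# Riemann-sum estimate of the differential entropy h = -∫ p(q) log p(q) dq.
q = np.linspace(-20.0, 20.0, 400_001)
dq = q[1] - q[0]

def entropy(pdf):
    p = pdf(q)
    mask = p > 0                      # treat 0·log 0 as 0
    return -np.sum(p[mask] * np.log(p[mask])) * dq

# Three densities, all with mean 0 and variance 1.
gaussian = lambda q: np.exp(-q**2 / 2) / np.sqrt(2 * np.pi)
laplace  = lambda q: np.exp(-np.sqrt(2) * np.abs(q)) / np.sqrt(2)
uniform  = lambda q: np.where(np.abs(q) <= np.sqrt(3), 1 / (2 * np.sqrt(3)), 0.0)

print(f"gaussian: {entropy(gaussian):.4f}")   # 0.5*log(2*pi*e) ≈ 1.4189  (the maximum)
print(f"laplace : {entropy(laplace):.4f}")    # 1 + 0.5*log(2)  ≈ 1.3466
print(f"uniform : {entropy(uniform):.4f}")    # log(2*sqrt(3))  ≈ 1.2425
```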
