[Math] Derivation of the maximum entropy distribution

entropy, probability-distributions

I am reading a book and having trouble following something. The problem is to maximize the differential entropy $-\int_{0}^{\infty}p(r)\log p(r)\,dr$ subject to the constraints $\int_{0}^{\infty}p(r)\,dr=1$ and $\int_{0}^{\infty}rp(r)\,dr=\mu_{r}$. The book then simply states that the result is an exponential distribution. I tried to prove this to myself but got stuck; it has been a while since I did any math, so I am pretty sure I am missing something very basic.

Here's how far I have gotten. I first wrote the Lagrangian

$L=-\int_{0}^{\infty}p(r)\log p(r)\,dr+\lambda_{1}\left(\int_{0}^{\infty}p(r)\,dr-1\right)+\lambda_{2}\left(\int_{0}^{\infty}rp(r)\,dr-\mu_{r}\right)\ \ \ \ \ \ \ \ (1)$

Now here is where I started getting a bit lost: I wasn't sure how to differentiate under the integral sign, so I treated the integral as a sum over the values $p(r_{i})$ and differentiated $L$ with respect to $p(r_{i})$. Is this correct? What is the proper, general way to take such derivatives?

Moving along, after differentiating (treating the integral as a sum) I got the following:

$\frac{\partial L}{\partial p}=-\log p(r)-1+\lambda_{1}+\lambda_{2}r=0$

$\implies p(r)=e^{\lambda_{2}r+\lambda_{1}-1}\ \ \ \ \ \ \ \ (2)$

$\frac{\partial L}{\partial\lambda_{1}}=\int_{0}^{\infty}p(r)\,dr-1=0$

$\implies\int_{0}^{\infty}e^{\lambda_{2}r+\lambda_{1}-1}dr=1\ \ \ \ \ \ \ \ (3)$

$\frac{\partial L}{\partial\lambda_{2}}=\int_{0}^{\infty}rp(r)\,dr-\mu_{r}=0$

$\implies\int_{0}^{\infty}re^{\lambda_{2}r+\lambda_{1}-1}\,dr=\mu_{r}\ \ \ \ \ \ \ \ (4)$

But as far as I can tell, the definite integrals in (3) and (4) diverge to infinity, which doesn't help me, and so I am stuck. I am sure I am making some very basic conceptual mistake. Can someone please help me finish this solution? Thanks.

Best Answer

The integrals only diverge if $\lambda_2 \geq 0$; for $\lambda_2 < 0$ they converge, since $\int_0^\infty \mathrm e^{-a x}\, \mathrm d x = \frac{1}{a}$ for $a>0$.

$\int_0^\infty x \mathrm e^{-a x} \mathrm d x$ may be evaluated by integration by parts.
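For completeness, here is a sketch of how these two hints finish the derivation, writing $a=-\lambda_{2}$ (which must be positive for the integrals to converge) and $C=e^{\lambda_{1}-1}$, so that $(2)$ reads $p(r)=Ce^{-ar}$:

```latex
% Constraint (3): normalization fixes C in terms of a.
(3):\quad C\int_{0}^{\infty}e^{-ar}\,dr=\frac{C}{a}=1
\quad\implies\quad C=a.

% Constraint (4): the mean fixes a (integration by parts gives 1/a^2).
(4):\quad C\int_{0}^{\infty}re^{-ar}\,dr=\frac{C}{a^{2}}=\mu_{r}
\quad\implies\quad \frac{1}{a}=\mu_{r}.

% Substituting back into (2):
p(r)=\frac{1}{\mu_{r}}\,e^{-r/\mu_{r}}.
```

This is the exponential distribution with mean $\mu_{r}$, as the book claims.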
