I know when the mean and variance of $\ln x$ are both fixed, then the maximum entropy probability distribution is lognormal. When the mean of a random variable with support on $[0, \infty)$ is fixed, the MEPD is the exponential distribution. My question is, what is the MEPD in the continuous case when neither the mean nor the variance is fixed, with support on $[0, \infty)$?
[Math] Maximum Entropy Distribution When Mean and Variance Are Not Fixed with Positive Support
bayesian, probability, probability-distributions
Related Solutions
Solving the Lagrange equations, we find that the exponent $a$ of the maximum entropy distribution with mean $0$ and variance $1$ must satisfy $$ \sum_{k\in\mathbb{Z}}(k^2-1)e^{-ak^2}=0, $$ which gives $a\doteq0.4999998943842821\sim\frac12$. The normalizing coefficient $c$ must satisfy $$ c\sum_{k\in\mathbb{Z}}e^{-ak^2}=1, $$ which gives $c\doteq0.3989422361322933\sim0.3989422804014327=\frac1{\sqrt{2\pi}}$.
Thus, the maximum entropy distribution on the integers that has a mean of $0$ and variance of $1$, is $$ p_k=c\,e^{-ak^2} $$ where $a$ and $c$ are given above. These values are extremely close to the Gaussian, which has the maximum entropy for a continuous distribution with the same constraints.
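For a quick numerical check (not part of the original derivation), here is a small Python sketch that solves the two conditions above; the truncation to $|k|\le 50$ and the root bracket $[0.1,1]$ are my own assumptions, justified by the $e^{-ak^2}$ decay of the terms.

```python
import numpy as np
from scipy.optimize import brentq

k = np.arange(-50, 51)   # truncation is safe: terms decay like exp(-a k^2)

def variance_condition(a):
    # sum_k (k^2 - 1) exp(-a k^2) = 0  <=>  variance 1 (mean 0 holds by symmetry)
    return np.sum((k**2 - 1) * np.exp(-a * k**2))

a = brentq(variance_condition, 0.1, 1.0)   # ~ 0.4999998943842821
c = 1.0 / np.sum(np.exp(-a * k**2))        # ~ 0.3989422361322933
print(a, c, 1 / np.sqrt(2 * np.pi))
```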
Although the function derived above is very close to the Gaussian distribution restricted to $\mathbb{Z}$, $\frac1{\sqrt{2\pi}}e^{-n^2/2}$ is not a probability measure on $\mathbb{Z}$. In fact, the Poisson Summation Formula says that $$ \begin{align} \frac1{\sqrt{2\pi}}\sum_{n\in\mathbb{Z}}e^{-n^2/2} &=1+2\sum_{n=1}^\infty e^{-2\pi^2n^2}\\ &\gt1 \end{align} $$
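A similar sketch confirms numerically that the restricted Gaussian sums to slightly more than $1$, in agreement with the Poisson Summation Formula.

```python
import numpy as np

n = np.arange(-50, 51)
total = np.sum(np.exp(-n**2 / 2)) / np.sqrt(2 * np.pi)            # direct sum over the integers
excess = 2 * np.sum(np.exp(-2 * np.pi**2 * np.arange(1, 5)**2))   # Poisson summation correction
print(total, 1 + excess)   # both slightly above 1, and they agree
```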
I believe the second paper you cited (by Harremoës) is actually the answer you're looking for. The Poisson distribution describes the number of occurrences of an event in a fixed interval, under the assumption that occurrences are independent. In particular, the constraint that the events be independent means that not every discrete distribution is a valid candidate for describing this system, and it motivates restricting attention to sums of (possibly infinitely many) independent Bernoulli variables. Harremoës then shows that if you further constrain the expected value (i.e., $\lambda$), the maximum entropy distribution in this class is the Poisson distribution.
So, the Poisson distribution is the maximum entropy distribution given constraints of counting independent events and having a known expected value.
That said, you can also easily reverse-engineer a (contrived) constraint for which the Poisson distribution would be the maximum entropy distribution.
Let our unknown constraint be $\mathbb{E}[f(k)] = c$. Maximizing the entropy with this constraint, along with the mean being $\lambda$, gives the minimization problem
$\sum_k p(k) \ln p(k) - \alpha \left( \sum_k p(k) - 1\right) - \beta\left(\sum_k k p(k) - \lambda\right) - \gamma \left( \sum_k p(k)f(k) - c \right)$,
where $\alpha$, $\beta$, and $\gamma$ are Lagrange multipliers. Setting the derivative with respect to $p(k)$ to zero yields
$\ln p(k) = -1 + \alpha + \beta k + \gamma f(k)$.
We already know the Poisson distribution has the form $p(k) = e^{-\lambda}\lambda^k/k!$, or $\ln(p(k)) = -\lambda + k \ln(\lambda) - \ln(k!)$. Therefore, we can guess that $f(k)$ has the functional form $\ln(k!)$.
So, the Poisson distribution maximizes entropy when $p$ has mean $\lambda$ and $\mathbb{E}(\ln k!) = $[some particular value depending on $\lambda$].
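As a sanity check, here is a small Python sketch (the rate $\lambda=3.5$ and the truncation at $k\le 29$ are arbitrary choices of mine) verifying that $\ln p(k)$ for the Poisson pmf is exactly an affine combination of $1$, $k$, and $\ln k!$, which is the form the Lagrange condition demands.

```python
import numpy as np
from scipy.stats import poisson
from scipy.special import gammaln

lam = 3.5
k = np.arange(0, 30)
log_p = poisson.logpmf(k, lam)

# Fit ln p(k) as an affine combination of 1, k, and ln(k!) = gammaln(k + 1)
A = np.column_stack([np.ones_like(k, dtype=float), k, gammaln(k + 1)])
coef, residual, *_ = np.linalg.lstsq(A, log_p, rcond=None)

print(coef)      # ~ [-lam, ln(lam), -1], i.e. ln p(k) = -lam + k ln(lam) - ln(k!)
print(residual)  # ~ 0: the fit is exact, so the Poisson pmf has the required form
```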
This approach may not be very satisfying, since it's not clear why we would want a distribution with a specified expectation value of $\ln k!$. The Johnson paper you cited is (in my opinion) similarly unsatisfying, since it essentially proves that the Poisson distribution is the maximal entropy distribution among distributions which are "more log-convex than the Poisson distribution".
Best Answer
In the discrete case you need to consider the functional
$$H[p]=-\sum_{i=1}^n p_i \ln(p_i)+\lambda(\sum_{i=1}^n p_i-1)$$
since only the normalization constraint is imposed.
Setting $\frac{\partial H[p]}{\partial p_i}=0$ for all $i=1,\dots,n$ we arrive at
$$-\ln(p_i)-1+\lambda=0\Leftrightarrow p_i=e^{\lambda-1}.$$
Imposing $\sum_{i=1}^n p_i-1=0$, one gets
$\lambda=1-\ln(n)$, or $p_i=e^{1-\ln(n)-1}=\frac{1}{n}$.
In summary, the desired distribution is the uniform distribution.
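If you want to see this numerically, here is a small sketch (with an arbitrary $n=5$) that maximizes the discrete entropy subject only to normalization; the optimizer returns the uniform distribution.

```python
import numpy as np
from scipy.optimize import minimize

n = 5

def neg_entropy(p):
    return np.sum(p * np.log(p))   # minimizing this maximizes the entropy

cons = ({'type': 'eq', 'fun': lambda p: np.sum(p) - 1},)   # normalization only
bounds = [(1e-9, 1.0)] * n                                  # keep each p_i positive

p0 = np.random.dirichlet(np.ones(n))                        # random feasible start
res = minimize(neg_entropy, p0, method='SLSQP', bounds=bounds, constraints=cons)
print(res.x)   # ~ [0.2, 0.2, 0.2, 0.2, 0.2], i.e. p_i = 1/n
```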
The continuous case needs more care, due to the nontrivial integration range. We want to maximize the functional
$$H[p]=-\int_{0}^{\infty}p(x)\ln(p(x))dx+\lambda(\int_{0}^{\infty}p(x)dx-1), $$
where $p$ has support $[0,\infty)$ and $p(0)=p(\infty)=0$. We apply the calculus of variations by considering any perturbation $\phi$ s.t. $\phi(0)=\phi(\infty)=0$, so that $p+\epsilon\phi$ satisfies the same boundary conditions as $p$. We compute the variation
$$\frac{\delta H}{\delta\phi}|_{p}=\lim_{\epsilon\rightarrow 0} \frac{H[p+\epsilon\phi]-H[p]}{\epsilon}=\lim_{\epsilon\rightarrow 0}\frac{1}{\epsilon}\left[\int_{0}^{\infty}\left(F(p+\epsilon\phi,x)-F(p,x)\right)dx+ \lambda(\int_{0}^{\infty}\epsilon\phi dx)\right],$$
where $F(p,x)=-p(x)\ln(p(x))$ and $F(p+\epsilon\phi,x)=-(p(x)+\epsilon\phi)\ln(p(x)+\epsilon\phi)$.
Using
$$F(p+\epsilon\phi,x)-F(p,x)=\epsilon\phi\frac{\partial F}{\partial p}(p,x)+O(\epsilon^2)$$
we have
$$\frac{\delta H}{\delta\phi}\Big|_{p}=\int_{0}^{\infty}\left(\frac{\partial F}{\partial p}(p,x)+\lambda\right)\phi\, dx,$$
where $\frac{\partial F}{\partial p}(p,x)=-\ln(p(x))-1$. Requiring the variation to vanish for every admissible $\phi$ gives
$$-\ln(p(x))-1+\lambda=0 $$
or $p(x)=e^{\lambda-1}$, together with $\int_0^{\infty}e^{\lambda-1}dx=1$, which is impossible: a positive constant integrates to $+\infty$ over $[0,\infty)$.
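One way to see the failure numerically: the uniform density on $[0,M]$ has differential entropy $\ln M$, which grows without bound as $M\to\infty$, so the entropy is unbounded on $[0,\infty)$ without further constraints. A minimal sketch:

```python
import numpy as np

for M in [10.0, 1e3, 1e6, 1e12]:
    p = 1.0 / M                    # uniform density on [0, M]
    entropy = -p * np.log(p) * M   # -integral of p ln p over [0, M] equals ln M
    print(M, entropy)              # grows like ln M, without bound
```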
Roughly speaking, without additional constraints, such as the fixed-mean condition
$$\int_{0}^{\infty} xp(x)dx=\mu$$
one cannot arrive at "more interesting" equations for $p(x)$. Note that $F$ does not depend on $p'(x)$; this leads to the simplified Euler–Lagrange equation
$$\frac{\partial F}{\partial p}(p,x)+\lambda=0.$$
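For instance, adding the fixed-mean constraint with a second multiplier $\lambda_1$ turns the stationarity condition into $$-\ln(p(x))-1+\lambda+\lambda_1 x=0 \quad\Longrightarrow\quad p(x)=e^{\lambda-1}e^{\lambda_1 x},$$ and imposing $\int_0^{\infty}p(x)dx=1$ and $\int_0^{\infty}xp(x)dx=\mu$ forces $\lambda_1=-1/\mu$ and $e^{\lambda-1}=1/\mu$, i.e. $p(x)=\frac{1}{\mu}e^{-x/\mu}$: the exponential distribution mentioned in the question.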