Probability – Maximum Entropy Distribution with Given Mean, Variance, and Skewness

entropy, maximum-principle, probability distributions

Is there a way to find the maximum entropy distribution with given values of the first three moments when the support is the set of real numbers? Or, without loss of generality, with zero mean, unit variance and skewness $\gamma$? How can I do that?

Best Answer

When the support is $S=\mathbb{R}$, there is no maximum entropy distribution with mean $\mu\in\mathbb{R}$, variance $\sigma^2\in\mathbb{R}_{>0}$ and skewness $\gamma\in\mathbb{R}_{\neq 0}$. Distributions satisfying those constraints always exist, but the set of entropies they attain is open: it has a supremum but no maximum.
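
One way to see why the maximum is not attained is the following heuristic sketch, which assumes the standard Lagrangian form of a maximum entropy density under moment constraints. A maximizer would have to look like

$$f(x) = Z^{-1}\exp\left(\lambda_1 x + \lambda_2 x^2 + \lambda_3 x^3\right), \qquad Z = \int_{\mathbb{R}} \exp\left(\lambda_1 x + \lambda_2 x^2 + \lambda_3 x^3\right)\,dx.$$

If $\lambda_3 \neq 0$, the cubic term dominates for large $|x|$ and the exponent tends to $+\infty$ in one direction, so $Z = \infty$ and $f$ cannot be normalized. If $\lambda_3 = 0$, $f$ is Gaussian and has zero skewness. Either way the constraint $\gamma \neq 0$ cannot be met. The supremum of the entropy is the normal value $\tfrac{1}{2}\log(2\pi e \sigma^2)$, which can be approached but not reached.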

For $S=[-a, +a]$ with sufficiently large $a\in\mathbb{R}$, there is a maximum entropy distribution $\mathcal{P}(a,\mu,\sigma^2,\gamma)$ with mean $\mu$, variance $\sigma^2$ and skewness $\gamma$, and its probability density function has the form $f(x) = Z^{-1}e^{\lambda_1 x + \lambda_2 x^2 + \lambda_3 x^3}$. Assuming that $\gamma\neq 0$, this distribution has strictly smaller entropy than $\mathcal{N}(\mu,\sigma^2)$, which is the maximum entropy distribution on $\mathbb{R}$ with the same mean and variance and has zero skewness. For $a - |\mu| \gg \sigma$, $\mathcal{P}(a,\mu,\sigma^2,\gamma)$ looks practically like $\mathcal{N}(\mu,\sigma^2)$ with a small amount of its probability mass taken and pressed against one of the interval endpoints. In the limit as $a\to\infty$, $\mathcal{P}(a,\mu,\sigma^2,\gamma)$ approaches $\mathcal{N}(\mu,\sigma^2)$, and its entropy therefore approaches that of $\mathcal{N}(\mu,\sigma^2)$.
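
For a concrete feel of the bounded-support case, here is a minimal numerical sketch in Python (NumPy only). It assumes mean 0 and variance 1, so the constraints become the raw moments $(0, 1, \gamma)$, and it replaces the interval $[-a, a]$ by a uniform grid with trapezoid weights. It then minimizes the convex dual $\log Z(\lambda) - \lambda\cdot m$ with a damped Newton iteration. The function name `maxent_cubic`, the grid size, and the values $a = 4$, $\gamma = 0.5$ are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def maxent_cubic(a, gamma, n=4001, tol=1e-9, max_iter=200):
    """Sketch: max-entropy density on [-a, a] with raw moments (0, 1, gamma).

    Returns (lam, x, density, entropy) for f(x) = exp(lam . (x, x^2, x^3)) / Z,
    discretized on a uniform grid (an assumption of this sketch).
    """
    x = np.linspace(-a, a, n)
    w = np.full(n, x[1] - x[0])
    w[[0, -1]] *= 0.5                       # trapezoid quadrature weights
    T = np.stack([x, x**2, x**3])           # sufficient statistics, shape (3, n)
    m = np.array([0.0, 1.0, gamma])         # target moments

    def dual(lam):
        # Returns log Z(lam) - lam . m (computed stably) and node probabilities.
        e = lam @ T
        c = e.max()
        p = w * np.exp(e - c)
        Z = p.sum()
        return np.log(Z) + c - lam @ m, p / Z

    lam = np.array([0.0, -0.5, 0.0])        # start from the N(0, 1) exponent
    f, p = dual(lam)
    for _ in range(max_iter):
        mean_T = T @ p
        grad = mean_T - m                   # gradient = moment mismatch
        if np.abs(grad).max() < tol:
            break
        centered = T - mean_T[:, None]
        hess = (centered * p) @ centered.T  # covariance of (x, x^2, x^3)
        step = np.linalg.solve(hess, -grad)
        slope = grad @ step
        t = 1.0
        while True:                         # backtracking (Armijo) line search
            f_new, p_new = dual(lam + t * step)
            if f_new <= f + 1e-4 * t * slope + 1e-13 or t < 1e-10:
                break
            t *= 0.5
        lam, f, p = lam + t * step, f_new, p_new
    # Once the moments match, the dual value equals the entropy of the fit.
    return lam, x, p / w, f

lam, x, density, H = maxent_cubic(a=4.0, gamma=0.5)
H_normal = 0.5 * np.log(2 * np.pi * np.e)   # entropy of N(0, 1)
print("lambda:", lam)
print("entropy:", H, "vs normal:", H_normal)
```

With these illustrative values the fitted $\lambda_3$ is positive and the entropy stays below $\tfrac{1}{2}\log(2\pi e)$, consistent with the argument above. Increasing `a` should move the entropy closer to the normal value, although a larger interval may also need a finer grid.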

A probability distribution with a small amount of its probability mass moved far away from the center in order to produce skewness was not what I had in mind, though. (Bad distribution!) Although the skewness of this distribution has the specified value, its higher order moments approach either $\infty$ or $-\infty$ as $a\to\infty$, so to get something more reasonable, some of the higher order moments need to be constrained as well. Constraining the kurtosis (the fourth standardized moment) to a maximum value makes the problem solvable when $S=\mathbb{R}$ and introduces an $x^4$ term with a negative coefficient in the exponent, which keeps the probability density function bounded and makes all higher order moments well defined as well.
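
Written out as a sketch in the same exponential form as the bounded-support density (the multiplier $\lambda_4$ is notation introduced here, and the kurtosis bound is assumed to be active), the kurtosis-constrained density is

$$f(x) = Z^{-1}\exp\left(\lambda_1 x + \lambda_2 x^2 + \lambda_3 x^3 + \lambda_4 x^4\right), \qquad \lambda_4 < 0.$$

Because the quartic term dominates the exponent for large $|x|$, the integral $\int_{\mathbb{R}} |x|^k f(x)\,dx$ is finite for every $k \ge 0$, so $Z$ and all moments are finite.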
