[Math] Why is Beta the maximum entropy distribution over Bernoulli’s parameter

pr.probability, st.statistics

Why is Beta(1,1) the maximum entropy distribution over the bias of a coin expressed as a probability given that:

  • If we express the bias as odds (which is over the support $[0, \infty)$), then Beta-prime(1,1) is the corresponding distribution to Beta(1,1). Isn't the maximum entropy distribution over the positive reals the exponential distribution (which is not Beta-prime(1,1))?

  • If we express the bias in log odds (which is over the support of the reals), then the logistic distribution (with mean 0 and scale 1) is the corresponding distribution to Beta(1,1).

Beta(1,1) makes sense as maximum entropy because it's flat over its support. The other distributions are not flat. If we had chosen a different parametrization, we should clearly arrive at the corresponding distribution (not something else). How are the other two distributions the maximum entropy distributions over their support? There must be some other requirement that I'm missing. What is it?
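(The correspondence claimed above is easy to verify numerically. The sketch below, assuming NumPy is available, draws from Beta(1,1) — i.e. Uniform(0,1) — pushes the samples through the odds and log-odds maps, and compares empirical CDFs against the closed-form Beta-prime(1,1) CDF $x/(1+x)$ and the standard logistic CDF $1/(1+e^{-x})$.)

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.uniform(0.0, 1.0, size=200_000)  # Beta(1,1) = Uniform(0,1) draws

# Odds transform: p/(1-p) should follow Beta-prime(1,1),
# whose CDF is F(x) = x / (1 + x) on [0, inf).
odds = p / (1.0 - p)
x = np.array([0.5, 1.0, 3.0])
empirical_odds = np.array([(odds <= xi).mean() for xi in x])
betaprime_cdf = x / (1.0 + x)

# Log-odds transform: log(p/(1-p)) should follow Logistic(0, 1),
# whose CDF is F(t) = 1 / (1 + exp(-t)) on the reals.
logodds = np.log(odds)
t = np.array([-1.0, 0.0, 2.0])
empirical_logodds = np.array([(logodds <= ti).mean() for ti in t])
logistic_cdf = 1.0 / (1.0 + np.exp(-t))
```

With 200,000 samples the empirical CDFs agree with the closed forms to within roughly 0.01 at each checkpoint.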

Best Answer

I think there are two separate things going on here. One is the issue of a maximum entropy distribution. The other is whether distributions are invariant under reparameterization. Regarding the second matter, I think your statement "if we had chosen a different parameterization, we should clearly arrive at the corresponding distribution" is probably not quite right (I say probably because I may be misinterpreting you). Only priors constructed in a particular way have this invariance property, and they are sometimes improper (i.e., not normalizable probability distributions). See http://en.wikipedia.org/wiki/Jeffreys_prior if this is what you're interested in.
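To make the first issue concrete: differential entropy is not invariant under a change of variables, so "the maximum entropy distribution" depends on the parameterization (and on which constraints you impose). A quick check, assuming SciPy is available, computes the differential entropy (in nats) of the same uniform belief expressed on the three scales from the question:

```python
from scipy import stats

# Same "flat over probability" belief, three parameterizations:
h_prob = float(stats.beta(1, 1).entropy())        # probability scale: 0
h_odds = float(stats.betaprime(1, 1).entropy())   # odds scale: ~2
h_logodds = float(stats.logistic(0, 1).entropy()) # log-odds scale: 2
```

The three numbers differ (0 versus 2 nats), even though all three densities describe exactly the same belief about the coin, which is why maximizing entropy on different scales picks out different distributions.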

P.S. I'd have preferred to leave this as a comment, but I can't yet, I guess.
