Uniform Distribution Parameter Estimation – Improper Priors in Bayesian Analysis

bayesian, estimation, uniform-distribution, uninformative-prior

We have N samples, $X_i$, from a uniform distribution $[0,\theta]$ where $\theta$ is unknown. Estimate $\theta$ from the data.

So, Bayes' rule…

$f(\theta | {X_i}) = \frac{f({X_i}|\theta)f(\theta)}{f({X_i})}$

and the likelihood is:

$f({X_i}|\theta) = \prod_{i=1}^N \frac{1}{\theta}$
(edit: when $0 \le X_i \le \theta$ for all $i$, and 0 otherwise — thanks whuber)
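As a quick sketch, this truncated likelihood can be written as a small function (the function name and sample values here are illustrative, not from the original post):

```python
import numpy as np

def likelihood(theta, x):
    """Likelihood of a uniform[0, theta] model for samples x:
    theta**(-N) when theta >= max(x), and 0 otherwise (theta > 0)."""
    x = np.asarray(x, dtype=float)
    if theta < x.max():
        return 0.0
    return theta ** (-len(x))

# The likelihood is maximised at theta = max(x) and decays like theta**(-N):
x = [0.3, 1.2, 0.7]
print(likelihood(1.2, x))  # the peak, at the sample maximum
print(likelihood(0.5, x))  # 0.0: theta below an observed value is impossible
```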

but with no other information about $\theta$, it seems like the prior should be proportional to $1$ (i.e. uniform) or to $\frac{1}{\theta}$ (the Jeffreys prior?) on $[0,\infty)$, but then my integrals don't converge, and I'm not sure how to proceed. Any ideas?

Best Answer

This has generated some interesting debate, but note that it really doesn't make much difference to the question of interest. Personally I think that because $\theta$ is a scale parameter, the transformation group argument is appropriate, leading to a prior of

$$p(\theta|I)=\frac{\theta^{-1}}{\log\left(\frac{U}{L}\right)}\propto\theta^{-1}, \qquad L<\theta<U$$

This distribution has the same form under rescaling of the problem (the likelihood also remains "invariant" under rescaling). The kernel of this prior, $f(y)=y^{-1}$, can be derived by solving the functional equation $af(ay)=f(y)$. The values $L,U$ depend on the problem, and really only matter if the sample size is very small (like 1 or 2). The posterior is a truncated Pareto, given by:

$$p(\theta|DI)=\frac{N\theta^{-N-1}}{(L^{*})^{-N}-U^{-N}}, \qquad L^{*}<\theta<U, \quad \text{where } L^{*}=\max(L,X_{(N)})$$

where $X_{(N)}$ is the $N$th order statistic, i.e. the maximum value of the sample. The posterior mean is

$$E(\theta|DI)=\frac{N\left((L^{*})^{1-N}-U^{1-N}\right)}{(N-1)\left((L^{*})^{-N}-U^{-N}\right)}=\frac{N}{N-1}L^{*}\left(\frac{1-\left[\frac{L^{*}}{U}\right]^{N-1}}{1-\left[\frac{L^{*}}{U}\right]^{N}}\right)$$

If we let $U\to\infty$ and $L\to 0$, we get the simpler expression $E(\theta|DI)=\frac{N}{N-1}X_{(N)}$.
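A quick numerical sanity check of this posterior mean (a sketch; the sample, the bounds $L,U$, and the variable names are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N, theta_true = 10, 2.0
x = rng.uniform(0.0, theta_true, N)

# Bounds for the truncated 1/theta prior, and L* = max(L, X_(N)).
L, U = 1e-3, 1e3
L_star = max(L, x.max())

# Closed-form posterior mean from the formula above.
closed = (N * (L_star ** (1 - N) - U ** (1 - N))
          / ((N - 1) * (L_star ** (-N) - U ** (-N))))

# Brute-force check: trapezoidal integration of theta * p(theta|D)
# over the support (L*, U) of the truncated Pareto posterior.
theta = np.linspace(L_star, U, 400_000)
dens = N * theta ** (-N - 1) / (L_star ** (-N) - U ** (-N))
numeric = np.sum(0.5 * (theta[1:] - theta[:-1])
                 * (theta[1:] * dens[1:] + theta[:-1] * dens[:-1]))

print(closed, numeric)  # agree, and both are close to N/(N-1) * x.max()
```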

But now suppose we use a more general prior, given by $p(\theta|cI)\propto\theta^{-c-1}$ (note that we keep the limits $L,U$ to ensure everything is proper, so no singular maths). The posterior is then the same as above, but with $N$ replaced by $c+N$, provided that $c+N>0$. Repeating the above calculations, we get the simplified posterior mean of

$$E(\theta|DI)=\frac{N+c}{N+c-1}X_{(N)}$$

So the uniform prior ($c=-1$) will give an estimate of $\frac{N-1}{N-2}X_{(N)}$ provided that $N\geq 3$ (the mean is infinite for $N=2$). This shows that the debate here is a bit like whether to use $N$ or $N-1$ as the divisor in the variance estimate.
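To see how little the choice matters for moderate $N$, here is a small comparison of the two posterior means (an illustrative sketch; the sample values and names are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
N, theta_true = 20, 5.0
x_max = rng.uniform(0.0, theta_true, N).max()

# Posterior means under p(theta) proportional to theta**(-c - 1),
# using the simplified formula (N + c)/(N + c - 1) * X_(N):
scale_prior = N / (N - 1) * x_max          # c = 0, the 1/theta prior
uniform_prior = (N - 1) / (N - 2) * x_max  # c = -1, the flat prior

print(scale_prior, uniform_prior)  # they differ by well under 1% at N = 20
```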

One argument against the use of the improper uniform prior in this case is that the posterior is improper when $N=1$: it is then proportional to $\theta^{-1}$, which is not integrable on $(X_{(1)},\infty)$. But this only matters when $N=1$, or when $N$ is very small.