Solved – MLE of $f(x \vert \theta) = 1/\theta$ for $x_1, \cdots, x_n \sim U(0,\theta)$, $\theta > 0$

Tags: likelihood, maximum likelihood, uniform distribution

Original question

$x_1 , \cdots , x_n$ are independent random variables, identically distributed as
a uniform distribution over $(0,\theta)$.

$$
f(x \vert \theta) = \frac{1}{\theta}, \; 0<x<\theta, \;\; \theta >0
$$

What is the Maximum Likelihood Estimator for $\theta$?

Comment on strict inequalities

(from olooney, edited slightly)

The MLE does not exist if we use strict inequality. But $x_i \sim U(0, \theta)$ admits various definitions, and the intent is clear under any of them: to model the data as uniformly distributed. For a continuous p.d.f., any finite set of points has measure 0 and can be added or removed, and the answer is almost surely (jargon for "with probability 1") the same.

Why not use a definition of the uniform distribution which makes the MLE problem tractable and which is almost surely the same?
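To make the measure-zero point concrete (a standard fact, not specific to this post): for continuous $X_i \sim U(0,\theta)$,

$$ P(X_i = 0) = P(X_i = \theta) = 0, $$

so the likelihoods built from the open-interval density ($0 < x < \theta$) and the closed-interval density ($0 \leq x \leq \theta$) coincide with probability 1.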

Note that the question uses strict inequalities, not weak ones.

This is highlighted in this answer.

Other similar posts

Though I searched beforehand, others have pointed out posts of a similar nature:

MLE for Uniform $(0,\theta)$

How do you differentiate the likelihood function for the uniform distribution in finding the M.L.E.?

Best Answer

The p.d.f. for a single $x_i$ is given as

$$ f(x \vert \theta) = \begin{cases} \frac{1}{\theta} & \text{if } 0 \leq x \leq \theta \\ 0 & \text{otherwise} \end{cases} $$

Write $\vec{x} = (x_1, \ldots, x_n)$.

The $n$ observations are i.i.d., so the likelihood of observing the $n$-vector $\vec{x} = (x_1, \ldots, x_n)$ is the product of the component-wise densities. Ignoring the issue of support for the moment, note that this product can be written simply as a power:

$$ f(\vec{x}| \theta) = \prod_{i=1}^n \frac{1}{\theta} = \frac{1}{\theta^n} = \theta^{-n} $$
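Equivalently (a standard step, added here for clarity), wherever the likelihood is positive the log-likelihood is

$$ \ell(\theta) = \log f(\vec{x} \vert \theta) = -n \log \theta, $$

which is strictly decreasing in $\theta$; this is what will push the maximizer down to the smallest feasible value of $\theta$.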

Next, we turn our attention to the support of this function. If any single component falls outside its interval of support $[0, \theta]$, then its factor in the product is 0, so the whole product is zero. Therefore $f(\vec{x}| \theta)$ is nonzero only when all components are inside $[0, \theta]$.

$$ f(\vec{x}| \theta) = \begin{cases} \theta^{-n} & & \text{if } \forall i, \ 0 \leq x_i \leq \theta \\ 0 & & \text{otherwise} \end{cases} $$

By definition, this is also our likelihood:

$$ \mathcal{L}(\theta; \vec{x}) = f(\vec{x}| \theta) = \begin{cases} \theta^{-n} & & \text{if } \forall i, \ 0 \leq x_i \leq \theta \\ 0 & & \text{otherwise} \end{cases} $$
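For concreteness, this likelihood is easy to code up. Here is a minimal sketch in Python (assuming NumPy; the function name `likelihood` and its signature are mine, not from the original question):

```python
import numpy as np

def likelihood(theta, x):
    """L(theta; x) for i.i.d. U(0, theta) data:
    theta^(-n) if every observation lies in [0, theta], else 0.
    """
    x = np.asarray(x, dtype=float)
    if theta <= 0 or not np.all((0.0 <= x) & (x <= theta)):
        return 0.0
    return theta ** (-x.size)
```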

The MLE problem is to maximize $\mathcal{L}$ with respect to $\theta$. Because $\theta > 0$ (given in the statement of the problem), we have $\theta^{-n} > 0$, so 0 is never the maximum. Thus, this is a constrained optimization problem:

$$ \hat{\theta} = \operatorname{argmax}_\theta \,\, \theta^{-n} \text{ s.t. } \forall i \,\, 0 \leq x_i \leq \theta $$

This is easy to solve as a special case, so rather than invoking the general machinery of constrained optimization we can present an elementary argument. Let $t = \max \, \{x_1, \ldots, x_n\}$; the constraint is then simply $\theta \geq t$. Suppose we have a candidate solution $\theta_1 = t + \epsilon$ for some $\epsilon > 0$, and let $\theta_2 = t + \epsilon/2$. Clearly both $\theta_1$ and $\theta_2$ are in the feasible region, and $\theta_2 < \theta_1 \implies \theta_2^{-n} > \theta_1^{-n}$, so $\theta_1$ is not the maximizer. We conclude that the maximum cannot be attained at any point strictly greater than $t$. Yet $t$ itself is in the feasible region, so it must be the maximizer. Therefore,

$$\hat{\theta} = \max \, \{x_1, \ldots, x_n\}$$

is the maximum likelihood estimator.
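As a quick numerical sanity check (reusing the hypothetical `likelihood` sketch above; agreement is limited by the grid resolution), the grid maximizer of $\mathcal{L}$ lands on the sample maximum:

```python
rng = np.random.default_rng(0)
theta_true = 3.0
x = rng.uniform(0.0, theta_true, size=50)

t = x.max()  # the claimed MLE: the sample maximum
grid = np.linspace(0.01, 5.0, 10_000)
L_vals = [likelihood(th, x) for th in grid]

print(t, grid[int(np.argmax(L_vals))])  # agree up to the grid spacing
```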

Note that if any observed $x_i$ is less than 0, then $\mathcal{L}$ is identically 0 and the maximization problem has no unique solution.
