[Math] Maximum likelihood estimator on uniform distribution

parameter estimation, uniform distribution

I try to be rational and keep my questions as impersonal as I can in order to comply with the community guidelines. But this one is making me mad. Here it goes.
Consider the uniform distribution on $[0, \theta]$. The likelihood function, using a random sample of size $n$, is
$\frac{1}{\theta^{n}}.$
Now $1/\theta^n$ is decreasing in $\theta$ over the range of positive values, hence it will be maximized by choosing $\theta$ as small as possible while still satisfying $0 \leq x_i \leq \theta$. The textbook says: 'That is, we choose $\theta$ equal to $X_{(n)}$, or $Y_n$, the largest order statistic.' But if we want to minimize $\theta$ to maximize the likelihood, why do we choose the biggest $x$? Suppose we had actual numbers for the $x_i$, say $X_1 = 2$, $X_2 = 4$, $X_3 = 8$. If we choose $8$, that yields $\frac{1}{8^{3}} = 0.001953125$. If we choose $2$, that yields $\frac{1}{2^{3}} = 0.125$. So why do we want the maximum, $X_{(n)}$, in this case and not the smallest, $X_{(1)}$, since we've just seen with actual numbers that the smaller the $x$, the bigger the likelihood? Thanks!
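(A quick Python sketch of the computation exactly as stated above, i.e. evaluating $1/\theta^n$ at each observation with no support constraint; the answer below explains what this calculation misses.)

```python
# Naive computation from the question: evaluate 1/theta^n at candidate
# values of theta, ignoring the requirement that theta >= every x_i.
sample = [2, 4, 8]
n = len(sample)
for theta in sample:
    print(f"1/{theta}^{n} = {1 / theta ** n}")
# 1/2^3 = 0.125        <- largest value, but theta = 2 turns out to be infeasible
# 1/4^3 = 0.015625
# 1/8^3 = 0.001953125
```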

Best Answer

What you are doing is wrong: you must first write down the full likelihood function. You found $1/\theta^n$, but where is it defined? It is true that $X_{(n)}$ is the maximum likelihood estimator, because it maximizes the true likelihood function. How do you find it?

Added: Your answer is actually in the right direction, but as I mentioned, it is missing a crucial point which alters everything. The right way of writing down the likelihood function is as follows:

\begin{align}L(x_n;\theta)&=\prod_{n=1}^N\theta^{-1}\mathbf{1}_{0\leq x_n\leq \theta}(x_n)\\&=\theta^{-N}\prod_{n=1}^N\mathbf{1}_{0\leq x_n\leq \theta}(x_n)\end{align}

Until now, $L$ is a function of $x_n$; now let's write it as a function of $\theta$:

\begin{align}L(\theta;x_n)=\theta^{-N}\prod_{n=1}^N\mathbf{1}_{\theta \geq x_n}(x_n)\end{align}
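A product of indicator functions equals $1$ exactly when every factor equals $1$, so the product collapses into a single indicator involving only the largest observation $x_{(N)}=\max_n x_n$:

\begin{align}\prod_{n=1}^N\mathbf{1}_{\theta\geq x_n}(x_n)=\mathbf{1}_{\theta\geq\max_n x_n}=\mathbf{1}_{\theta\geq x_{(N)}}\end{align}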

Observe that $L(\theta;x_n)$ is zero if $\theta<x_{(N)}$, and that it is a positive, decreasing function of $\theta$ when $\theta\geq x_{(N)}$. So for any choice $\theta>x_{(N)}$ we have $L(x_{(N)};x_n)>L(\theta;x_n)$, which means the maximum is reached at $\hat\theta=x_{(N)}$. In the example from the question, $\theta=2$ does not give likelihood $1/2^3$ but $0$, because the observations $4$ and $8$ lie outside $[0,2]$.
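To check this numerically with the sample from the question, here is a minimal Python sketch of the corrected likelihood (the helper `likelihood` is hypothetical, written just for this illustration); note how every $\theta$ below the sample maximum gets likelihood $0$:

```python
# Full likelihood for Uniform[0, theta]:
# L(theta) = theta^(-N) if theta >= max(sample), else 0.
sample = [2, 4, 8]

def likelihood(theta, xs):
    """Likelihood including the indicator 1{0 <= x_i <= theta}."""
    if theta < max(xs):           # some observation falls outside [0, theta]
        return 0.0
    return theta ** (-len(xs))    # theta^(-N) on the feasible region

for theta in [2, 4, 8, 10, 16]:
    print(f"theta = {theta:>2}: L = {likelihood(theta, sample):.6f}")

# theta =  2: L = 0.000000   <- 4 and 8 lie outside [0, 2]
# theta =  4: L = 0.000000   <- 8 lies outside [0, 4]
# theta =  8: L = 0.001953   <- the maximum: theta_hat = max(sample)
# theta = 10: L = 0.001000
# theta = 16: L = 0.000244
```

The likelihood jumps from $0$ to its global maximum exactly at $\theta = x_{(N)} = 8$ and decreases from there, which is why the largest order statistic, not the smallest, is the MLE.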