Finding MLEs for density functions whose support depends on the parameter is usually not straightforward, in the sense that differentiating the likelihood will not help.
Denote by $\textbf{1}_{[a,b]}(x)$ the indicator function of $[a,b]$, that is,
$$\textbf{1}_{[a,b]}(x)= \begin{cases} 1 \mbox{ if } x\in [a,b] \\ 0 \mbox{ if not.}\end{cases}$$
The density function of a uniform random variable $U(0,\theta)$ is given by
$$f_{\theta}(x) = \frac{1}{\theta} \textbf{1}_{[0,\theta]}(x).$$
Given a sample of $n$ random observations $Y_1,\dots, Y_n$ (it is convenient to write observed values with lowercase letters and random variables with uppercase letters, so capital $Y$'s mean these are computations made before we collect the data), the likelihood function is given by:
$$L(\theta|Y_1,\dots, Y_n)= \prod_{i=1}^n f_{\theta}(Y_i) = \frac{1}{\theta^n}\prod_{i=1}^n \textbf{1}_{[0,\theta]}(Y_i).$$
Now look at the product of indicator functions. We should rewrite it as a function of $\theta$ rather than of the $Y_i$'s, in order to see exactly where $L$ is nonzero. The product is nonzero only if $0<Y_i<\theta$ for all $i=1,\dots,n$; in other words, only if $0<\min_i Y_i \le \max_i Y_i < \theta$. So the likelihood is positive only for $\theta>\max_{i=1,\dots,n} Y_i$. That is,
$$L(\theta|Y_1,\dots, Y_n)= \frac{1}{\theta^n} \textbf{1}_{[\max_{i=1,\dots,n} Y_i, \infty)}(\theta).$$
Now, the function $1/\theta^n$ is strictly decreasing on $(0,\infty)$, so over the interval $[\max_{i} Y_i, \infty)$ its maximum is attained at the left endpoint. In conclusion, the MLE of $\theta$ is the maximum value of the sample of $Y_i$'s, which is quite intuitive:
$$\hat{\theta}_{MLE} = \max_{i=1,\dots,n} Y_i.$$
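As a quick sanity check, here is a minimal R sketch (the true value $\theta = 5$, the sample size, and the seed are arbitrary choices, not part of the argument) showing that computing this estimate is literally taking the sample maximum:

```r
# Minimal sketch: simulate a Unif(0, theta) sample and take the maximum as the MLE
set.seed(1)
theta <- 5                            # true parameter, chosen only for illustration
y <- runif(20, min = 0, max = theta)  # n = 20 observations
theta_hat <- max(y)                   # MLE of theta; always a bit below the true value
theta_hat
```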
Now try to reproduce the same computations and ideas when we have a uniform distribution depending on two parameters: $f_{\theta_1, \theta_2} (x) = \frac{1}{\theta_2- \theta_1}$ for $\theta_1 < x < \theta_2$. Intuition suggests that if you have a sample $Y_1,\dots, Y_n$, then the MLEs of $\theta_1$ and $\theta_2$ should be:
$$ \hat{\theta_1} = \min_{i=1,\dots,n} Y_i \mbox{ and } \hat{\theta_2} = \max_{i=1,\dots,n} Y_i.$$
Observe that the estimators are expressed with capital letters. So they are random. Each time you collect a new sample you get different estimates. Hence when you collect a sample of observations $y_1,\dots, y_n$ (small letters) the estimates are:
$$ \hat{\theta_1} = \min_{i=1,\dots,n} y_i \mbox{ and } \hat{\theta_2} = \max_{i=1,\dots,n} y_i.$$
(Just substitute the values you got in the exercise)
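If you want to do this in R, a minimal sketch (with a made-up sample; replace `y` by the data from your exercise) is:

```r
# Sketch for the two-parameter case: the MLEs are the sample minimum and maximum
y <- c(1.2, 3.4, 2.8, 0.9, 4.1)   # hypothetical observations; use your own data here
theta1_hat <- min(y)              # MLE of theta_1
theta2_hat <- max(y)              # MLE of theta_2
c(theta1_hat, theta2_hat)
```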
I hope this helped! ;)
Let $\delta := \theta_2 - \theta_1$ and $Z_i := X_i - \theta_1$. We have $Z_i \sim \text{Unif}(0,\delta)$, and $X_{(1)} = \theta_1 + Z_{(1)}$ and $X_{(n)} = \theta_1 + Z_{(n)}$. Then, $$R := X_{(n)} - X_{(1)} = Z_{(n)} - Z_{(1)}$$ is a good estimator of $\delta$.
We note that $[Z_{(n)} - Z_{(1)}]/\delta$ is a pivot, i.e. its distribution does not depend on $\delta$ and it is easy to find out. This can be used to construct the CI.
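A quick simulation sketch (the parameter values, $n = 10$, and the number of replications below are arbitrary choices of mine) illustrates the pivot property: the distribution of $(X_{(n)} - X_{(1)})/\delta$ looks the same whatever $\theta_1$ and $\delta$ are.

```r
# Sketch: the scaled range (max - min)/delta has the same distribution for any theta_1, delta
set.seed(1)
n <- 10
sim_pivot <- function(theta1, delta, reps = 1e4) {
  replicate(reps, {
    x <- runif(n, theta1, theta1 + delta)
    (max(x) - min(x)) / delta
  })
}
summary(sim_pivot(0, 1))     # standard case
summary(sim_pivot(-7, 42))   # very different parameters, essentially the same summaries
```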
Hint: Note that $U_i := Z_i/\delta$ is uniform $(0,1)$. You are basically computing the distribution of $U_{(n)} - U_{(1)}$. You can either use the known joint distribution of $(U_{(1)},U_{(n)})$ or do it directly yourself:
\begin{align}
\mathbb P(U_{(n)} - U_{(1)} \le t) &= \mathbb P(U_{(n)} \le t + U_{(1)}) \\
&= \int_0^1 \mathbb P[U_{(n)} \le t + u, \; U_{(1)} \in (u,u+du)]
\end{align}
The integrand is the probability that the minimum is about $u$ and the rest of them are $\in(u, t+u)$. There are $n$ possible choices for which one of the variables is the minimum, and all these events have the same probability $ du (\min\{t+u,1\}-u)^{n-1}$.
Hence,
$$
p(t) := \mathbb P(U_{(n)} - U_{(1)} \le t) = n\int_0^1 (\min\{t+u,1\}-u)^{n-1} du.
$$
This can be simplified and computed explicitly: $p(t) = n(1-t)t^{n-1} + t^n$ for $t \in [0,1]$.
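As a numerical sanity check of this closed form (a sketch; $n = 10$, the threshold $t = 0.5$, and the simulation size are arbitrary choices):

```r
# Sketch: compare the closed-form p(t) with a Monte Carlo estimate of P(U_(n) - U_(1) <= t)
set.seed(1)
n <- 10
p <- function(t) n * (1 - t) * t^(n - 1) + t^n   # closed form, valid for t in [0, 1]
u <- matrix(runif(1e5 * n), ncol = n)            # 1e5 samples of size n
r <- apply(u, 1, max) - apply(u, 1, min)         # simulated ranges of Unif(0,1) samples
mean(r <= 0.5)                                   # empirical probability
p(0.5)                                           # closed-form value; should be very close
```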
Choose $t_1$ and $t_2$ such that $p(t_2)-p(t_1) = 0.95$. Then $[R/t_2,R/t_1]$ is a 95% CI for $\delta$. You can try to minimize the length, by minimizing $|1/t_2 - 1/t_1|$ subject to the given constraint.
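One numerical way to do this search (a sketch; the choice $n = 10$, the grid resolution, and the use of `uniroot` are mine, not part of the answer): for each candidate $t_1$ with $p(t_1) \le 0.05$, solve $p(t_2) = p(t_1) + 0.95$ for $t_2$, and keep the pair minimizing $1/t_1 - 1/t_2$.

```r
# Sketch: shortest 95% CI multipliers (t1, t2) for delta, giving the interval [R/t2, R/t1]
n <- 10
p <- function(t) n * (1 - t) * t^(n - 1) + t^n                 # CDF of R/delta on [0, 1]
t1_max  <- uniroot(function(t) p(t) - 0.05, c(1e-6, 1))$root   # need p(t1) <= 0.05
t1_grid <- seq(1e-4, t1_max, length.out = 500)
t2_of   <- function(t1) uniroot(function(t) p(t) - p(t1) - 0.95, c(t1, 1))$root
t2_grid <- sapply(t1_grid, t2_of)
lengths <- 1 / t1_grid - 1 / t2_grid                           # CI length is R * (1/t1 - 1/t2)
best    <- which.min(lengths)
c(t1 = t1_grid[best], t2 = t2_grid[best])
```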
Some more details: Let $A := \{U_{(n)} \le t + u, \; U_{(1)} \in (u,u+du)\}$. We can write (assume $t > du$ for this to formally be true)
\begin{align*}
A &= \bigcup_{i=1}^n \{U_{(n)} \le t + u, \; U_{(1)} \in (u,u+du), \; U_{(1)} = U_i\} \\
&= \bigcup_{i=1}^n \{ U_j \le t + u,\, \forall j\neq i, \;\; U_{i} \in (u,u+du), \;\;U_{(1)} = U_i\} \\
&=\bigcup_{i=1}^n \{ U_i < U_j \le t + u,\, \forall j\neq i, \;\; U_{i} \in (u,u+du)\} \\
&\approx \bigcup_{i=1}^n \underbrace{\{ u+ du < U_j \le t + u,\, \forall j\neq i, \;\; U_{i} \in (u,u+du)\}}_{A_i}
\end{align*}
The last one is really $\supset$ in place of $\approx$, but one somehow argues that as $du \to 0$, it approaches the desired set. The sets $A_i$ are disjoint (for example on $A_1$, $U_1$ is in $(u,u+du)$ while it is in $(u+du,t+u)$ in all other $A_j, j\neq 1$), hence $\mathbb P(A) = \sum_{i=1}^n \mathbb P(A_i)$. By symmetry $P(A_i)$ are the same for all $i$, hence $ \mathbb P(A) = n \mathbb P(A_1)$, and
$$
\mathbb P(A_1) = \Big[\prod_{j=2}^n \mathbb P( u + du < U_j \le t+u) \Big]\mathbb P( U_1 \in (u,u+du))
$$
by independence.
Best Answer
This is not the usual parameterization of a uniform distribution, so looking at standard references may not be immediately helpful.
Begin by showing that $E(X) = \theta_1$ and $Var(X) = \theta_2^2/3.$
Then, for the given sample, find $\bar X = 0.375$ and $S^2 = 6.3825.$ Computations in R:
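A minimal sketch of those computations, assuming the sample from the exercise is stored in a vector `x` (not reproduced here):

```r
# Sketch: x is assumed to hold the sample values from the exercise (not shown here)
xbar <- mean(x)   # sample mean; should give 0.375
s2   <- var(x)    # sample variance; should give 6.3825
```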
Finally, set $E(X) = \mu = \theta_1 = \bar X$ and $Var(X) = \sigma^2 = \theta_2^2/3 = S^2.$ Solve for $\theta_1, \theta_2$ in terms of the data to get numerical MOM estimates $\hat\theta_1, \hat\theta_2,$ respectively. Obviously, $\hat\theta_1 = \bar X = 0.375.$ What is $\hat\theta_2$?
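One way to carry out that solve in R (a sketch reusing `xbar` and `s2` from above; the algebra just inverts the two moment equations):

```r
# Sketch: invert E(X) = theta_1 and Var(X) = theta_2^2 / 3 at the sample moments
theta1_hat <- xbar           # method-of-moments estimate of theta_1
theta2_hat <- sqrt(3 * s2)   # method-of-moments estimate of theta_2
c(theta1_hat, theta2_hat)
```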
Note: Later you may study maximum likelihood estimates. Sometimes MOM (method-of-moments) estimators are the same as MLEs (maximum likelihood estimators), and sometimes not. When sampling from the uniform distribution of this problem, there are differences between the two kinds of estimators.