In a reply concerning interpreting probability densities I have argued for the merits of including the differential terms ($dx$ and $dy$ in this case) in the PDF, so let's write
$$f(x,y; \theta,\lambda) = \frac{1}{\lambda x \sqrt{2\pi}} \exp\left(-\frac{(y-x\theta)^2}{2x^2} - \frac{x}{\lambda}\right)\,dx\,dy.$$
In (d), attention is focused on $\hat{\theta}$, which appears to be a multiple of $\sum_i y_i/x_i$, where each $(x_i,y_i)$ is an independent realization of the bivariate random variable described by $f$. To tackle this, let's consider the distribution of $Z=Y/X$; later we will sum a sequence of independent versions of this variable to determine the distribution of $\hat\theta$.
In contemplating the change of variable $(X,Y)\to (X,Z) = (X,Y/X)$, we recognize that the non-negativity of $X$ assures this induces an order-preserving, one-to-one correspondence between $Y$ and $Z$ for each $X$. Therefore all we need to do is change the variable in the integrand, writing $y=x z$:
$$f(x,x z; \theta,\lambda) = \frac{1}{\lambda x \sqrt{2\pi}} \exp\left(-\frac{(x z-x\theta)^2}{2x^2} - \frac{x}{\lambda}\right)\,dx\,d(x z).$$
From the product rule $d(x z) = z dx + x dz$, and understanding the combination $dx\, d(x z)$ in the sense of differential forms $dx \wedge d(x z)$, we mechanically obtain
$$dx\, d(x z) = dx \wedge d(x z) = dx \wedge (z dx + x dz) = z dx \wedge dx + x dx \wedge dz = x\,dx\,dz.$$
(This is the easy way to obtain the Jacobian determinant of the transformation.)
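For readers who like to verify such computations mechanically, here is a minimal sketch using sympy (my choice of tool, not anything from the question) confirming that the Jacobian determinant of $(x,z)\mapsto(x,xz)$ is $x$:

```python
import sympy as sp

x, z = sp.symbols('x z', positive=True)

# The transformation (x, z) -> (x, y) with y = x*z.
T = sp.Matrix([x, x * z])

# Its Jacobian with respect to (x, z); the determinant is the factor
# relating dx dy to dx dz.
J = T.jacobian([x, z])
print(J)                      # Matrix([[1, 0], [z, x]])
print(sp.simplify(J.det()))   # x, matching dx ^ d(xz) = x dx ^ dz
```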
Therefore the distribution of $(X,Z)$ has PDF
\begin{align}
g(x,z;\theta,\lambda) = f(x, x z; \theta, \lambda) &= \frac{1}{\lambda x \sqrt{2\pi}} \exp\left(-\frac{(x z-x\theta)^2}{2x^2} - \frac{x}{\lambda}\right)\,x\,dx\,dz \\
&= \frac{1}{\lambda} \exp\left( - \frac{x}{\lambda}\right)\,dx\ \ \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{( z-\theta)^2}{2}\right) \,dz.
\end{align}
Because this has separated into a product of a PDF for $X$ and a PDF for $Z$, we see without any further effort that (1) $X$ and $Z$ are independent, (2) $X$ has an Exponential$(\lambda)$ distribution, and (3) $Z$ has a Normal$(\theta,1)$ distribution.
Finding the joint distribution of $(\hat{\lambda}, \hat\theta)$, which are easily expressed in terms of $X$ and $Z$, is now straightforward.
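If you want to see the conclusion in action, here is a minimal simulation sketch (the values $\theta=2$, $\lambda=3$ are arbitrary choices of mine, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
theta, lam, n = 2.0, 3.0, 100_000   # arbitrary illustrative values

# Sample from f: X ~ Exponential(lambda), then Y | X = x ~ Normal(x*theta, x^2).
x = rng.exponential(scale=lam, size=n)
y = rng.normal(loc=x * theta, scale=x)

z = y / x
print(z.mean(), z.std())         # close to theta and 1: Z ~ Normal(theta, 1)
print(np.corrcoef(x, z)[0, 1])   # close to 0, consistent with independence
```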
Your question is easy to answer if you are not too serious about $\theta_i\in[0,1]$. Is $\theta_i\in(0,1)$ good enough? Let's say it is. Then, instead of maximizing the likelihood function $L(\theta)$ in $\theta$, you do a change of variables and maximize the likelihood function $L(\alpha)=L(\theta(\alpha))$ in $\alpha$.
What's $\theta(\alpha)$, you ask? Well, if $\theta$ is a $K$-dimensional vector, then we let $\alpha$ be a $(K-1)$-dimensional vector and set:
\begin{align}
\theta_1 &= \frac{\exp(\alpha_1)}{1+\sum_{k=1}^{K-1} \exp(\alpha_k)} \\
\theta_2 &= \frac{\exp(\alpha_2)}{1+\sum_{k=1}^{K-1} \exp(\alpha_k)} \\
&\vdots\\
\theta_{K-1} &= \frac{\exp(\alpha_{K-1})}{1+\sum_{k=1}^{K-1} \exp(\alpha_k)} \\
\theta_K &= \frac{1}{1+\sum_{k=1}^{K-1} \exp(\alpha_k)}
\end{align}
After you substitute $\alpha$ into your likelihood function, you can maximize it unconstrained. Each $\alpha_k$ can be any real number. The $\theta(\alpha)$ function magically imposes all your constraints on $\theta$. So now the usual theorems proving consistency and asymptotic normality of the MLE follow.
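Here is a minimal sketch of this recipe for multinomial count data, using scipy; the counts and the helper names (`theta_of_alpha`, `negloglik`) are my own illustration, not from your question:

```python
import numpy as np
from scipy.optimize import minimize

counts = np.array([20, 30, 50])   # hypothetical multinomial counts, K = 3

def theta_of_alpha(alpha):
    # The map above: theta_k = exp(alpha_k)/(1 + sum exp(alpha)) for k < K,
    # and theta_K = 1/(1 + sum exp(alpha)).
    e = np.exp(alpha)
    return np.append(e, 1.0) / (1.0 + e.sum())

def negloglik(alpha):
    return -np.sum(counts * np.log(theta_of_alpha(alpha)))

# Unconstrained maximization over alpha in R^(K-1).
res = minimize(negloglik, x0=np.zeros(len(counts) - 1))
theta_hat = theta_of_alpha(res.x)
print(theta_hat)         # close to [0.2, 0.3, 0.5], the sample proportions
print(theta_hat.sum())   # exactly 1 by construction
```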
What about $\theta$, though? Well, after you have estimated the $\alpha$, you just substitute them into the formulas above to get your estimator for $\theta$. What is the distribution of $\hat{\theta}$? It is asymptotically normal with mean $\theta_0$, the true value of $\theta$, and variance $V(\hat{\theta})=\frac{\partial \theta}{\partial \alpha}\, V(\hat{\alpha})\, \frac{\partial \theta}{\partial \alpha}'$, where $\frac{\partial \theta}{\partial \alpha}$ is the $K\times(K-1)$ Jacobian of the map above (this is the delta method).
As you say, $V(\hat{\theta})$ won't be full rank. Obviously, it can't be full rank. Why not? Because the variance of $\sum_i \hat{\theta}_i$ has to be zero: this sum is always 1, so its variance must be zero. A non-invertible variance matrix is not a problem, however, unless you are using it for some purpose it can't serve (say, to test the null hypothesis that $\sum_i \theta_i = 1$). If you are trying to do that, then the error message telling you that you can't divide by zero is an excellent warning that you are doing something silly.
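Continuing the sketch above: for this particular map the Jacobian works out to $\partial\theta_i/\partial\alpha_j = \theta_i\,(\mathbf{1}[i=j] - \theta_j)$, and the delta-method variance comes out singular exactly as described. (Here I use BFGS's built-in inverse-Hessian approximation as a crude stand-in for $V(\hat\alpha)$.)

```python
# Continuing the sketch above: delta-method variance for theta_hat.
K = len(theta_hat)
J = np.zeros((K, K - 1))
for i in range(K):
    for j in range(K - 1):
        # dtheta_i/dalpha_j = theta_i * (1[i == j] - theta_j)
        J[i, j] = theta_hat[i] * ((i == j) - theta_hat[j])

V_alpha = res.hess_inv        # BFGS inverse-Hessian approximation to V(alpha_hat)
V_theta = J @ V_alpha @ J.T   # V(theta_hat), a K x K matrix
print(V_theta.sum(axis=1))    # rows sum to ~0: V(theta_hat) is singular, as expected
```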
What if you are serious about including the endpoints of your interval? Well, that's much harder. What I would suggest is that you think about whether you are really serious. For example, if the $\theta_i$ are probabilities (and that's what your constraints make me think they are), then you really should not be expecting the usual maximum likelihood procedures to give you correct standard errors.
For example, if $\theta_1$ is the probability of heads and $\theta_2$ is the probability of tails, and your dataset looks like ten heads in a row, then the maximum likelihood estimate is $\hat{\theta}_1=1$ and $\hat{\theta}_2=0$. What's the variance of the maximum likelihood estimator evaluated at this estimate? Zero: the usual estimate $\hat{\theta}_1(1-\hat{\theta}_1)/n$ vanishes at the boundary.
If you want to test the null hypothesis that $\theta_1=0.5$, what do you do? You sure don't do this: "Reject the null if $\left|\frac{\hat{\theta}_1-0.5}{\sqrt{\hat{V}(\hat{\theta}_1)}}\right|>1.96$." Instead, you calculate the probability of getting ten heads in a row with a fair coin. If that probability is lower than whatever significance level you picked, then you reject.
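For the record, that calculation is one line with scipy's exact binomial test (a sketch; `binomtest` requires scipy >= 1.7):

```python
from scipy.stats import binomtest

# Ten heads in ten flips; exact test of p = 0.5.
result = binomtest(k=10, n=10, p=0.5, alternative='two-sided')
print(result.pvalue)   # 2 * 0.5**10 = ~0.00195, well below 0.05, so reject
```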
In general, but not always, the constrained MLE will be the feasible value closest to the unconstrained MLE. To re-use your example: if $\sum_i x_i / n = 0.7$ but $0 < \theta \leq 0.5$, then the unconstrained MLE for $\theta$ is $0.7$, while the constrained MLE for $\theta$ is $0.5$. This will always be the case when the log-likelihood is concave in the parameter of interest.
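A minimal numerical sketch of this, with a hypothetical 70 successes in 100 trials (my numbers filling in your sample mean of 0.7):

```python
import numpy as np
from scipy.optimize import minimize_scalar

n, successes = 100, 70   # hypothetical data with sample mean 0.7

def negloglik(p):
    # Bernoulli/binomial negative log-likelihood (up to a constant).
    return -(successes * np.log(p) + (n - successes) * np.log(1 - p))

unconstrained = minimize_scalar(negloglik, bounds=(1e-9, 1 - 1e-9), method='bounded')
constrained = minimize_scalar(negloglik, bounds=(1e-9, 0.5), method='bounded')
print(unconstrained.x)   # ~0.7, the sample mean
print(constrained.x)     # ~0.5, pushed to the boundary of the feasible set
```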
Note that if you have a strict inequality constraint, for example, $0 < \theta < 0.5$, the constrained MLE doesn't exist, as there is no largest number less than $0.5$.
In some cases, though, the likelihood function can be multimodal. (The Student-t distribution is an example of this.) The unconstrained MLE corresponds to whichever mode is the highest (or, in the case of ties, to any value from the set corresponding to the highest modes). In these cases, the constraint may exclude the highest mode, so that the constrained MLE becomes either the value corresponding to the highest mode remaining in the interior of the set of feasible values or a value on the boundary of that set; the latter may not appear to be a mode at all when looking at the unconstrained problem.
The following graph shows an example of this. The unconstrained MLE is at $6.4$, but, if we add a constraint at the red line ($5.5$), the new MLE is at $5.5$, despite the presence of a mode at $0.7$.
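The same phenomenon is easy to reproduce numerically. Here is a minimal sketch with hypothetical two-cluster data of my own (not the data behind the graph), using a Cauchy, i.e. Student-t with 1 df, location likelihood:

```python
import numpy as np
from scipy.stats import cauchy

# A lone point at 0 plus a cluster near 12.4 gives a bimodal Cauchy
# (Student-t, 1 df) location likelihood.
data = np.array([0.0, 12.0, 12.4, 12.8])
grid = np.linspace(-5, 15, 4001)
loglik = np.array([cauchy.logpdf(data, loc=m).sum() for m in grid])

unconstrained = grid[np.argmax(loglik)]              # near 12.4, the global mode
feasible = grid <= 9.0                               # impose the constraint mu <= 9
constrained = grid[feasible][np.argmax(loglik[feasible])]
print(unconstrained, constrained)
# The constrained MLE lands on the boundary (9.0) even though there is an
# interior local mode near 0.25: the boundary has the higher likelihood here.
```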