Let $s_n=x_1+\cdots+x_n$ and $t_n=\max\{x_k\mid1\leqslant k\leqslant n\}$; then
$$
f(x_1,\ldots,x_n\mid\theta)=\mathrm e^{-s_n}(1-\mathrm e^{-\theta})^{-n}\mathbf 1(\theta\geqslant t_n).
$$
Since $2\mathrm e^{-\lambda}=1+\mathrm e^{-\theta}$, that is, $1-\mathrm e^{-\theta}=2(1-\mathrm e^{-\lambda})$, this is also
$$
f(x_1,\ldots,x_n\mid\theta)=\mathrm e^{-s_n}2^{-n}(1-\mathrm e^{-\lambda})^{-n}\mathbf 1(\lambda\geqslant\log2-\log(1+\mathrm e^{-t_n})).
$$
Since $(1-\mathrm e^{-\lambda})^{-n}$ is decreasing in $\lambda$, $f(x_1,\ldots,x_n\mid\theta)$ is maximized at the smallest value of $\lambda$ the indicator allows, that is, at $\lambda=\hat\lambda$ with
$$
\hat\lambda=\log2-\log(1+\mathrm e^{-t_n}).
$$
Note that $\hat\lambda=g(\hat\theta)$, where $g(\theta)=\log2-\log(1+\mathrm e^{-\theta})$ is the function defined by $\lambda=g(\theta)$ and $\hat\theta=t_n$ is the MLE of $\theta$; this is the invariance property of the MLE.
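As a quick numerical sanity check, here is a minimal simulation sketch, assuming the $x_i$ are draws from a standard exponential truncated to $[0,\theta]$ (the model the likelihood above corresponds to); the names `theta_hat` and `lam_hat` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n = 2.0, 1000

# Inverse-CDF sampling from the exponential truncated to [0, theta]:
# F(x) = (1 - exp(-x)) / (1 - exp(-theta)) on [0, theta].
u = rng.uniform(size=n)
x = -np.log1p(-u * (1.0 - np.exp(-theta)))

theta_hat = x.max()  # MLE of theta is t_n = max_k x_k

def g(t):
    # The reparametrization lambda = g(theta) = log 2 - log(1 + exp(-theta)).
    return np.log(2.0) - np.log1p(np.exp(-t))

lam_hat = g(theta_hat)  # invariance: the MLE of lambda is g(theta_hat)
print(theta_hat, lam_hat, g(theta))  # theta_hat, lam_hat approach theta, g(theta)
```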
An example can also be given involving misspecification.
Assume that we have an i.i.d. sample of size $n$ of random variables following the Half Normal distribution with parameter $v$. The density and moments of this distribution are
$$f_H(x) = \sqrt{2/\pi}\cdot \frac 1{v^{1/2}}\cdot \exp\big\{-\frac {x^2}{2v}\big\},\qquad x\geqslant 0$$
$$E_H(X) = \sqrt{2/\pi}\cdot v^{1/2}\equiv \mu_x,\;\; \operatorname{Var}_H(X) = \left(1-\frac 2{\pi}\right)v$$
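These two moments are easy to verify by simulation, since a Half Normal draw is just the absolute value of a centered normal with variance $v$; a minimal sketch with an illustrative value of $v$:

```python
import numpy as np

rng = np.random.default_rng(1)
v = 1.7  # illustrative value of the parameter
x = np.abs(rng.normal(scale=np.sqrt(v), size=10**6))  # |N(0, v)| is Half Normal

print(x.mean(), np.sqrt(2 / np.pi) * np.sqrt(v))  # E(X) = sqrt(2/pi) v^{1/2}
print(x.var(), (1 - 2 / np.pi) * v)               # Var(X) = (1 - 2/pi) v
```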
The log-likelihood of the sample is
$$L(v\mid \mathbf x) = n\ln\sqrt{2/\pi}-\frac n2\ln v -\frac {1}{2v}\sum_{i=1}^nx_i^2$$
The first and second derivatives with respect to $v$ are
$$\frac {\partial}{\partial v}L(v\mid \mathbf x) = -\frac n{2v} + \frac {1}{2v^2}\sum_{i=1}^nx_i^2,\;\; \frac {\partial^2}{\partial v^2}L(v\mid \mathbf x) = \frac n{2v^2} - \frac {1}{v^3}\sum_{i=1}^nx_i^2$$
So the Fisher Information for parameter $v$ is
$$\mathcal I(v) = -E\left[\frac {\partial^2}{\partial v^2}L(v\mid \mathbf x)\right] = -\frac n{2v^2} + \frac {1}{v^3}\sum_{i=1}^nE(x_i^2) = -\frac n{2v^2} + \frac {n}{v^3}E(X^2)$$
$$=-\frac n{2v^2} + \frac {n}{v^3}\left[\operatorname{Var}(X)+\big(E[X]\big)^2\right] = -\frac n{2v^2} + \frac {n}{v^3}\left[\left(1-\frac 2{\pi}\right)v+\frac 2{\pi}v\right] = -\frac n{2v^2} + \frac {n}{v^3}\cdot v$$
$$\Rightarrow \mathcal I(v) = \frac n{2v^2}$$
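This result can also be checked numerically through the identity $\mathcal I(v)=\operatorname{Var}\big(\frac{\partial}{\partial v}L(v\mid\mathbf x)\big)$; a rough Monte Carlo sketch, with illustrative values of $v$ and $n$:

```python
import numpy as np

rng = np.random.default_rng(2)
v, n, reps = 1.7, 50, 20000

# Score at the true v: dL/dv = -n/(2v) + sum(x_i^2) / (2 v^2).
x = np.abs(rng.normal(scale=np.sqrt(v), size=(reps, n)))
scores = -n / (2 * v) + (x**2).sum(axis=1) / (2 * v**2)

print(scores.var(), n / (2 * v**2))  # Var(score) should match I(v) = n/(2 v^2)
```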
The Fisher Information for the mean $\mu_x$ is then
$$\mathcal I (\mu_x) = \mathcal I(v) \cdot \left(\frac {\partial \mu_x}{\partial v}\right)^{-2} = \frac n{2v^2}\cdot \left(\sqrt{2/\pi}\frac 12 v^{-1/2}\right)^{-2} = \frac {n\pi}{v}$$
and so the Cramér-Rao lower bound for the mean is
$$CRLB (\mu_x) = \left[\mathcal I (\mu_x)\right]^{-1} = \frac {v}{n\pi}$$
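In code, the change-of-parameter step and the resulting bound look like this (same illustrative $v$ and $n$ as above):

```python
import numpy as np

v, n = 1.7, 50
I_v = n / (2 * v**2)                            # Fisher information for v
dmu_dv = np.sqrt(2 / np.pi) / (2 * np.sqrt(v))  # d mu_x / d v
I_mu = I_v / dmu_dv**2                          # I(mu_x) = I(v) * (d mu_x / d v)^{-2}

print(1 / I_mu, v / (n * np.pi))                # both equal CRLB(mu_x) = v/(n pi)
```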
Assume now that we want to estimate the mean using maximum-likelihood, but we make a mistake: we assume that these random variables follow an Exponential distribution with density
$$g(x) = \frac 1{\beta}\cdot \exp\big\{-(1/\beta)x\big\}$$
The mean here is equal to $\beta$, and the maximum likelihood estimator will be
$$\hat \beta_{mMLE} = \hat E(X)_{mMLE} = \frac 1n\sum_{i=1}^nx_i$$
where the lowercase $m$ denotes that this estimator is based on a misspecified density.
Nevertheless, its moments must be calculated using the true density that the $X$'s actually follow. We then see that this estimator is unbiased, since
$$E_H[\hat E(X)_{mMLE}] = \frac 1n\sum_{i=1}^nE_H[x_i] = E_H(X) = \mu_x$$
while its variance is
$$\operatorname{Var}(\hat E(X)_{mMLE}) = \frac 1n\operatorname{Var}_H(X) = \frac 1n\left(1-\frac 2{\pi}\right)v$$
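Both moments are easy to confirm by simulating the misspecified estimator under the true Half Normal law; a minimal sketch with illustrative $v$ and $n$:

```python
import numpy as np

rng = np.random.default_rng(3)
v, n, reps = 1.7, 50, 40000

# The misspecified MLE is just the sample mean, evaluated on Half Normal data.
x = np.abs(rng.normal(scale=np.sqrt(v), size=(reps, n)))
means = x.mean(axis=1)

print(means.mean(), np.sqrt(2 / np.pi) * np.sqrt(v))  # unbiased for mu_x
print(means.var(), (1 - 2 / np.pi) * v / n)           # variance under the true law
```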
This variance is greater than the Cramér-Rao lower bound for the mean: the inequality
$$ \operatorname{Var}(\hat E(X)_{mMLE}) = \frac 1n\left(1-\frac 2{\pi}\right)v > CRLB (\mu_x) = \frac {v}{n\pi} $$
is equivalent to
$$1-\frac 2{\pi} > \frac {1}{\pi} \Longleftrightarrow 1 > \frac 3{\pi}$$
which holds since $\pi > 3$. So we have an MLE which is unbiased but does not attain the Cramér-Rao lower bound for the quantity it estimates. Its efficiency is
$$\frac {CRLB (\mu_x)}{\operatorname{Var}(\hat E(X)_{mMLE})} = \frac {\frac {v}{n\pi}}{\frac 1n\left(1-\frac 2{\pi}\right)v} = \frac 1{\pi - 2} \approx 0.876$$
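As a one-line arithmetic check of the efficiency figure (the ratio does not depend on $v$ or $n$):

```python
import numpy as np

v, n = 1.7, 50  # arbitrary; they cancel in the ratio
crlb = v / (n * np.pi)
var_mmle = (1 - 2 / np.pi) * v / n
print(crlb / var_mmle, 1 / (np.pi - 2))  # both ~= 0.8760
```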
Note that the MLE for the mean under the correct specification, $\sqrt{2/\pi}\cdot\hat v^{1/2}$ with $\hat v = \frac 1n\sum_{i=1}^n x_i^2$, is biased, with a downward bias: by Jensen's inequality, $E\big[\hat v^{1/2}\big] < \big(E[\hat v]\big)^{1/2} = v^{1/2}$.
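The downward bias is visible in simulation; a sketch of this correct-specification MLE, with illustrative values again:

```python
import numpy as np

rng = np.random.default_rng(4)
v, n, reps = 1.7, 20, 40000

x = np.abs(rng.normal(scale=np.sqrt(v), size=(reps, n)))
v_hat = (x**2).mean(axis=1)                   # MLE of v under the Half Normal model
mu_hat = np.sqrt(2 / np.pi) * np.sqrt(v_hat)  # MLE of the mean, by invariance

# The average of mu_hat falls below the true mean (Jensen's inequality).
print(mu_hat.mean(), np.sqrt(2 / np.pi) * np.sqrt(v))
```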
Best Answer
It should be intuitively obvious that such an estimator is necessarily biased, because it can never be smaller than the true value of $\theta$. If it were, then you would observe $$\hat\theta_{\text{MLE}} = X_{1:n} = \min_i X_i < \theta,$$ which is absurd, since every observation satisfies $X_i \geqslant \theta$. So if there is a nonzero probability that the MLE is strictly greater than $\theta$ (which of course is the case), it must be biased upward, since $\Pr[\hat \theta_{\text{MLE}} < \theta] = 0$.
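For concreteness, a small simulation sketch, assuming for illustration a shifted exponential $X_i=\theta+E_i$ with $E_i$ standard exponential (one model in which the MLE of $\theta$ is the sample minimum):

```python
import numpy as np

rng = np.random.default_rng(5)
theta, n, reps = 2.0, 10, 40000

x = theta + rng.exponential(size=(reps, n))  # support is [theta, inf)
theta_hat = x.min(axis=1)                    # MLE is the sample minimum

print((theta_hat < theta).mean())  # 0.0: Pr[theta_hat < theta] = 0
print(theta_hat.mean() - theta)    # positive bias; here E[min] - theta = 1/n
```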