Solved – Maximum Likelihood Estimator of the exponential distribution parameter based on Order Statistics

estimation, exponential distribution, maximum likelihood, order-statistics, self-study

The following question is part (1/4) of a 2.5-hour written exam for the course "Probability and Statistics" in a school of engineering. So, although tricky and difficult (the professor is really demanding of his students), it should be solvable in a reasonable amount of time and with a reasonable amount of calculation.

Let $X_1, \ldots, X_n$ be a random sample (i.i.d. r.v.) from the exponential distribution $\exp(\lambda)$, where $\lambda$ is unknown. Let $M_n=\max\{X_1, \ldots, X_n\}$ with cumulative distribution function $$G(x)=(1-e^{-\lambda x})^{n}, \qquad x>0$$
and zero elsewhere.

Q1. Find the probability density function of $M_n$.

Q2. If $M_n$ is the only information that you have for $X_1,X_2,\ldots,X_n$, find the maximum likelihood estimator (MLE) $\hat{\lambda}_n$ of $\lambda$.

Q3. Using $(1+x)^n>1+nx$ (or any other way), prove that $\hat{\lambda}_n$ is consistent, i.e. that $P(| \hat{\lambda}_n-\lambda|>\epsilon)\longrightarrow0$ as $n\rightarrow \infty$.

For Q1, I took the derivative of the CDF of $M_n$, which gives $$g(x)=n\lambda e^{-\lambda x}(1-e^{-\lambda x})^{n-1}$$ (double-checked with Wolfram|Alpha).
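As a quick sanity check on this (equivalently, on the CDF $G$ itself), one can simulate maxima of exponential samples and compare the empirical CDF with $G$. A minimal Python sketch, with arbitrary illustrative values for $\lambda$ and $n$:

```python
import math
import random

# Monte Carlo sanity check of G(x) = (1 - exp(-lambda*x))^n: draw many samples
# of size n from Exp(lambda), record their maxima, and compare the empirical
# CDF of M_n at a few points with the claimed G.  All values are illustrative.
random.seed(0)
lam, n, reps = 2.0, 5, 200_000

maxima = [max(random.expovariate(lam) for _ in range(n)) for _ in range(reps)]

for x in (0.5, 1.0, 1.5):
    empirical = sum(m <= x for m in maxima) / reps
    theoretical = (1 - math.exp(-lam * x)) ** n
    print(f"x={x}: empirical {empirical:.4f} vs G(x) {theoretical:.4f}")
```

With 200,000 replications the empirical and theoretical values should agree to two or three decimal places.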

For Q2, I thought that the function I should maximize (with respect to $\lambda$) is $g(x)$, because that is my single observation from the sample of size $n$. If I understand the exercise correctly, someone takes a sample of $n$ observations $X_1,X_2,\ldots,X_n$ and tells me only their maximum $M_n$. From this single piece of information I have to calculate an MLE for $\lambda$. So I will maximize the pdf of $M_n$, which is now my likelihood function, no? Or is that where my mistake is?

However, taking $$L(x;\lambda)=g(x)$$ and $$l(x;\lambda)=\ln\left(L(x;\lambda)\right)=\ln\left(g(x)\right)=\ln(n)+\ln(\lambda)-\lambda x+(n-1)\ln(1-e^{-\lambda x}),$$ I then, as usual, calculated the derivative of $l(x;\lambda)$ and set it equal to $0$: $$\frac{d}{d\lambda}l(x;\lambda)=\frac{1}{\lambda}-x+(n-1)\frac{xe^{-\lambda x}}{1-e^{-\lambda x}}=0,$$ which reduces to $$e^t=\frac{1-nt}{1-t},$$ where $t=\lambda x$. But I cannot solve this equation (a transcendental equation, as someone told me).
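Although the equation has no closed form in $t$, it is easy to solve numerically. Rewriting it as $f(t)=(1-t)e^t+nt-1=0$, note that $t=0$ is a spurious root, so the search must start away from zero; $f(\ln n)=n-1>0$ while $f$ is negative for large $t$, which gives a valid bisection bracket. A sketch with hypothetical inputs ($n=10$ observations, observed maximum $x=2.0$):

```python
import math

# The stationarity condition e^t = (1 - n t)/(1 - t), with t = lambda * x,
# rewritten as f(t) = (1 - t) e^t + n t - 1 = 0.  t = 0 is a spurious root,
# so we bracket the positive root away from zero and bisect.
def solve_t(n, tol=1e-12):
    f = lambda t: (1 - t) * math.exp(t) + n * t - 1
    lo, hi = math.log(n), math.log(n) + 10.0   # f(lo) = n - 1 > 0, f(hi) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

# Hypothetical inputs: n = 10 observations, observed maximum x_max = 2.0.
n, x_max = 10, 2.0
t = solve_t(n)
lambda_hat = t / x_max
print(f"t = lambda*x = {t:.6f}, so lambda_hat = {lambda_hat:.6f}")
```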

Best Answer

Since you are a tutor, any knowledge is for a good cause, so I will provide some bounds for the MLE.

We have arrived at

$$(1-\lambda x_{(n)})e^{\lambda x_{(n)} } + \lambda n x_{(n)} - 1 = 0$$ with $x_{(n)}\equiv M_n$. So

$$(1-\hat \lambda x_{(n)})e^{\hat \lambda x_{(n)}} = 1-\hat \lambda x_{(n)}n $$ Assume first that $1-\hat \lambda x_{(n)} >0$. Then we must also have $1-\hat \lambda x_{(n)}n>0$, since the exponential is always positive. Moreover, since $x_{(n)}, \hat \lambda > 0$, we have $e^{\hat \lambda x_{(n)}}>1$. Therefore we should have

$$\frac {1-\hat \lambda x_{(n)}n}{1-\hat \lambda x_{(n)}}>1 \Rightarrow \hat \lambda x_{(n)}>\hat \lambda x_{(n)}n$$ which is impossible. Therefore we conclude that

$$\hat \lambda >\frac 1{x_{(n)}},\;\; \hat \lambda = \frac c{x_{(n)}}, \;\; c>1$$

Inserting into the log-likelihood we get

$$\ell(\hat\lambda(c)\mid x_{(n)}) = \log \frac c{x_{(n)}} + \log n - \frac c{x_{(n)}} x_{(n)} + (n-1) \log (1 - e^{-\frac c{x_{(n)}} x_{(n)}})$$

$$= \log \frac n{x_{(n)}} + \log c - c + (n-1) \log (1 - e^{-c})$$

We want to maximize this likelihood with respect to $c$. Its 1st derivative is

$$\frac{d\ell}{dc}=\frac 1c -1 +(n-1)\frac 1{e^{c}-1}$$

Setting this equal to zero, we require that

$$e^{c}-1 - c\left(e^{c}-1\right)+(n-1)c =0$$

$$\Rightarrow \left(n-e^c\right)c = 1-e^c$$

Since $c>1$, the RHS is negative. Therefore we must also have $n-e^c <0 \Rightarrow c > \ln n$. For $n\ge 3$ we have $\ln n > 1$, so this provides a tighter lower bound for the MLE; it doesn't cover the $n=2$ case, however, so

$$\hat \lambda > \max \left\{\frac 1{x_{(n)}}, \frac {\ln n}{x_{(n)}}\right\}$$

Moreover (for $n\ge 3$) rearranging the 1st-order condition we have that

$$c= \frac{e^c-1}{e^c-n} > \ln n \Rightarrow e^c -1 > e^c\ln n -n\ln n $$

$$\Rightarrow n\ln n-1>e^c(\ln n -1) \Rightarrow c< \ln{\left[\frac{n\ln n-1}{\ln n -1}\right]}$$ So for $n\ge 3$ we have that

$$\frac 1{x_{(n)}}\ln n < \hat \lambda < \frac 1{x_{(n)}}\ln{\left[\frac{n\ln n-1}{\ln n -1}\right]}$$

This is a narrow interval, especially if $x_{(n)}\ge 1$. For example (truncated at the 3rd decimal digit):

$$\begin{align} n=10 & &\frac 1{x_{(n)}}2.302 < \hat \lambda < \frac 1{x_{(n)}}2.827\\ n=100 & & \frac 1{x_{(n)}}4.605 < \hat \lambda < \frac 1{x_{(n)}}4.847\\ n=1000 & & \frac 1{x_{(n)}}6.907 < \hat \lambda < \frac 1{x_{(n)}}7.063\\ n=10000 & & \frac 1{x_{(n)}}9.210< \hat \lambda < \frac 1{x_{(n)}}9.325\\ \end{align}$$
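These bounds are easy to confirm numerically: bisecting the first-order condition $f(c)=(1-c)e^c+nc-1=0$ inside the derived bracket recovers the tabulated endpoints. A small Python sketch:

```python
import math

# Solve the first-order condition f(c) = (1 - c) e^c + n c - 1 = 0 by bisection,
# using the derived interval (ln n, ln[(n ln n - 1)/(ln n - 1)]) as the bracket:
# f is positive at the lower endpoint and negative at the upper one.
def mle_c(n):
    f = lambda c: (1 - c) * math.exp(c) + n * c - 1
    lo = math.log(n)                                          # f(lo) = n - 1 > 0
    hi = math.log((n * math.log(n) - 1) / (math.log(n) - 1))  # f(hi) < 0
    while hi - lo > 1e-12:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

for n in (10, 100, 1000, 10000):
    lower = math.log(n)
    upper = math.log((n * math.log(n) - 1) / (math.log(n) - 1))
    print(f"n={n:>5}: {lower:.3f} < c = {mle_c(n):.3f} < {upper:.3f}")
```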

Numerical examples indicate that, as $n$ grows, the MLE essentially coincides with the upper bound up to the second decimal digit.

ADDENDUM: A CLOSED-FORM EXPRESSION
This is only an approximate solution (it only approximately maximizes the likelihood), but here it is.
Manipulating the first-order condition, we want to have

$$\lambda = \frac 1{x_{(n)}}\ln \left[\frac {\lambda x_{(n)}n -1}{\lambda x_{(n)} -1}\right]$$

Now, one can show (it is a standard result for the maximum of i.i.d. exponentials) that

$$E[X_{(n)}] = \frac {H_n}{\lambda},\;\; H_n = \sum_{k=1}^n\frac 1k$$
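This identity is easy to verify by simulation; a quick Monte Carlo sketch with arbitrary illustrative values of $\lambda$ and $n$:

```python
import math
import random

# Monte Carlo check of E[X_(n)] = H_n / lambda for the maximum of n i.i.d.
# Exp(lambda) draws.  lambda, n, and the replication count are arbitrary.
random.seed(1)
lam, n, reps = 1.5, 8, 200_000

mean_max = sum(max(random.expovariate(lam) for _ in range(n))
               for _ in range(reps)) / reps
H_n = sum(1.0 / k for k in range(1, n + 1))
print(f"simulated mean of X_(n): {mean_max:.4f}, H_n/lambda: {H_n / lam:.4f}")
```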

Solving for $\lambda$ and inserting into the RHS of the implicit 1st-order condition, we obtain

$$\lambda = \frac 1{x_{(n)}}\ln \left[\frac {nH_n\frac {x_{(n)}}{E[X_{(n)}]} -1}{ H_n\frac {x_{(n)}}{E[X_{(n)}]} -1}\right]$$

We want an estimate of $\lambda$ given that $X_{(n)}=x_{(n)}$, i.e. $\hat \lambda \mid \{X_{(n)}=x_{(n)}\}$. But in that case we also have $E[X_{(n)}\mid X_{(n)}=x_{(n)}] =x_{(n)}$. This simplifies the expression, and we obtain

$$\hat \lambda = \frac 1{x_{(n)}}\ln \left[\frac {nH_n -1}{ H_n -1}\right]$$

One can verify that this closed-form expression stays close to the upper bound derived previously, while remaining a bit below the actual (numerically obtained) MLE.
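One way to check this numerically is to compare $c$-values, since the observed maximum $x_{(n)}$ cancels from the comparison. A sketch, re-solving the first-order condition by bisection:

```python
import math

# Numerical MLE for c = lambda_hat * x_(n): bisect f(c) = (1 - c) e^c + n c - 1
# on the bracket (ln n, ln n + 10), where f changes sign, then compare with the
# closed-form approximation ln[(n H_n - 1)/(H_n - 1)].
def mle_c(n):
    f = lambda c: (1 - c) * math.exp(c) + n * c - 1
    lo, hi = math.log(n), math.log(n) + 10.0
    while hi - lo > 1e-12:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

for n in (10, 100, 1000):
    H_n = sum(1.0 / k for k in range(1, n + 1))
    closed_form = math.log((n * H_n - 1) / (H_n - 1))
    print(f"n={n:>4}: closed form {closed_form:.4f} vs numerical MLE {mle_c(n):.4f}")
```

The closed-form value sits slightly below the numerical root at every $n$, as claimed above.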