Maximum likelihood and sufficient statistic of a shifted exponential distribution.

maximum-likelihood, probability, solution-verification, statistics, sufficient-statistics

Consider the following probability density function of a random variable $Y$:
$$
f(y \mid \theta)=e^{-(y-\theta)},\quad y\ge\theta
$$

and $0$ otherwise. We take a random sample $(Y_1, Y_2, \ldots, Y_k)$ and want to find a sufficient statistic and a maximum likelihood estimator for $\theta$.

Now, the likelihood is given by
$$
L\left(y_1, y_2, \ldots, y_k \mid \theta\right)=\prod_{i=1}^k e^{-\left(y_i-\theta\right)}=\exp \left(-\sum_{i=1}^k y_i+k \theta\right)
$$

Since the exponent $-\sum_{i=1}^k y_i + k\theta$ is increasing in $\theta$, the likelihood is maximized by taking $\theta$ as large as possible. But the density is nonzero only when $y \ge \theta$, so my first intuition is that the MLE for $\theta$ is $\min(y_1, y_2, \ldots, y_k)$, although I am not sure that this is correct.
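As a numerical sanity check, here is a minimal sketch (assuming NumPy; the true $\theta$, sample size, and grid are arbitrary choices for illustration) that maximizes the log-likelihood over a grid and compares the maximizer with the sample minimum:

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true, k = 2.0, 50
# Shifted exponential: Y = theta + standard exponential
y = theta_true + rng.exponential(scale=1.0, size=k)

def log_likelihood(theta, y):
    # log L = k*theta - sum(y) when theta <= min(y); -inf otherwise
    if theta > y.min():
        return -np.inf
    return len(y) * theta - y.sum()

# Grid search over candidate theta values up to just past min(y)
grid = np.linspace(0.0, y.min() + 0.5, 10_001)
ll = [log_likelihood(t, y) for t in grid]
theta_hat = grid[np.argmax(ll)]

# The maximizer should match min(y) up to the grid spacing
print(theta_hat, y.min())
```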

For the sufficient statistic, I believe we can choose $S=-\sum_{i=1}^k Y_i$, in which case the likelihood function can be written as the product of $g(s, \theta)=e^{s+k \theta}$ and $h(y_1, y_2, \ldots, y_k)=1$, and the factorization theorem then tells us that $S$ is a sufficient statistic.

Can someone tell me if I have made a mistake or misunderstood something?

Best Answer

Your argument for the maximum likelihood estimator is fine, since the likelihood is $$e^{k\theta -\sum_i y_i} \mathbf{1}_{\theta \le \min_i y_i}.$$
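To spell out the maximization step: on the region $\theta \le \min_i y_i$ the factor $e^{k\theta - \sum_i y_i}$ is strictly increasing in $\theta$, while the indicator drops the likelihood to $0$ the moment $\theta$ exceeds $\min_i y_i$, so
$$
\hat{\theta}_{\mathrm{MLE}} = \min_i y_i.
$$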

As I mentioned in a comment, your MLE $\min_i y_i$ must be a function of any sufficient statistic (and the same should hold for any estimator you would want to use), so, contrary to your comment, the MLE and sufficient statistics are definitely related. This is a fundamental property of sufficient statistics. If you don't believe me, here is an excerpt from Wikipedia:

"A sufficient statistic is a function of the data whose value contains all the information needed to compute any estimate of the parameter (e.g. a maximum likelihood estimate)."

Since $\min_i y_i$ is not a function of $\sum_i y_i$, we see that $\sum_i y_i$ is not a sufficient statistic. The lesson here is to always encode the support of a density with an indicator function (as I did above) before doing further operations such as maximizing the likelihood or applying the Fisher–Neyman factorization theorem. With the indicator function in place, the factorization $$e^{k\theta} \mathbf{1}_{\theta \le \min_i y_i} \cdot e^{-\sum_i y_i} = g(\min_i y_i, \theta) \cdot h(y_1, \ldots, y_k)$$ shows that $\min_i y_i$ is a sufficient statistic.
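If you want to see the factorization concretely, here is a small sketch (again assuming NumPy; the parameter values are arbitrary) that checks $L(\theta) = g(\min_i y_i, \theta)\,h(y_1,\ldots,y_k)$ at several values of $\theta$, including one outside the support:

```python
import numpy as np

rng = np.random.default_rng(1)
k, theta_true = 10, 1.5
y = theta_true + rng.exponential(size=k)  # shifted exponential sample

def likelihood(theta, y):
    # Full likelihood with the support encoded as an indicator
    return np.exp(len(y) * theta - y.sum()) * (theta <= y.min())

def g(s, theta, k):
    # Depends on the data only through s = min(y)
    return np.exp(k * theta) * (theta <= s)

def h(y):
    # Free of theta
    return np.exp(-y.sum())

for theta in [0.3, 1.0, 1.4, 5.0]:  # 5.0 exceeds min(y), outside the support
    assert np.isclose(likelihood(theta, y), g(y.min(), theta, k) * h(y))
```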