[Math] Maximum likelihood estimator of minimum function with exponential RV and a random number

probability, probability-distributions, statistics

I'm having some problems with the following assignment:

Let $X_1, X_2, \ldots, X_n$ be samples from an exponential distribution with parameter $\lambda$, and let $c_1, c_2, \ldots, c_n$ be a sequence of positive numbers. Define
\begin{align*}
Y_i = \min(X_i, c_i) \quad \text{and} \quad \Delta_i = \textbf{1} \{ Y_i = X_i \}.
\end{align*}

Determine the likelihood of the observed data $(Y_1, \Delta_1), (Y_2, \Delta_2), \ldots, (Y_n, \Delta_n)$.

My main problems are that I do not know how to calculate the pdf of the $Y_i$, and even if I knew those pdfs, I am not sure how to use the $\Delta_i$ correctly.

What I've done so far:

I've first followed this post for calculating the pdf of $Y_i$: How to find the pdf of [min(RV.1,RV.2)]/RV.2

So, first notice that $Y_i$ is a random variable supported on $[ 0, c_i]$.

For $0 \leq y < c_i$ we see:
\begin{align*}
P(Y_i \leq y) = P(\min(X_i, c_i) \leq y) = P(X_i \leq y) = 1 - e^{-\lambda \; y}
\end{align*}

For $y = c_i$ we see:
\begin{align*}
P(Y_i = y) = P(\min(X_i, c_i) = y) = P(X_i > y) = e^{- \lambda \; c_i}
\end{align*}
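
This point mass can be sanity-checked with a quick simulation (a sketch; the values of $\lambda$ and $c_i$ below are arbitrary illustrations):

```python
import math
import random

random.seed(0)
lam, c = 0.7, 1.5          # arbitrary illustrative rate and ceiling
n = 200_000

# Draw X ~ Exp(lam) and count how often Y = min(X, c) hits the ceiling.
hits = sum(1 for _ in range(n) if random.expovariate(lam) >= c)
empirical = hits / n
theoretical = math.exp(-lam * c)   # P(Y = c) = P(X > c) = e^{-lam * c}
print(empirical, theoretical)
```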

Let $F_{Y_i}$ denote the distribution function of $Y_i$, then we see:
\begin{align*}
F_{Y_i}(y) &= 1 - e^{-\lambda \; y} &0 \leq y < c_i \\
F_{Y_i}(y) &= 1 &y = c_i
\end{align*}
and
\begin{align*}
f_{Y_i}(y) = \lambda e^{- \lambda y}, \quad 0 \leq y < c_i
\end{align*}
which obviously does not integrate to 1.

Disregarding that for the moment, it seems logical to me that the likelihood function $\mathcal{L}(\lambda)$ then becomes
\begin{align*}
\mathcal{L}(\lambda) = \prod_{i = 1}^n \left( \Delta_i \lambda e^{- \lambda \; Y_i} + (1 - \Delta_i) e^{- \lambda \; c_i} \right).
\end{align*}

Taking the logarithm of this expression, we get
\begin{align*}
\log \mathcal{L}(\lambda) = \sum_{i = 1}^n \log\left( \Delta_i \lambda e^{- \lambda \; Y_i} + (1 - \Delta_i) e^{- \lambda \; c_i} \right)
\end{align*}
\end{align*}
and calculating the derivative gives us
\begin{align*}
\frac{ \mathrm{d} \log \mathcal{L}(\lambda)}{ \mathrm{d} \lambda} = \sum_{i = 1}^n \frac{\Delta_i e^{\lambda \; c_i}( \lambda Y_i - 1) - c_i (\Delta_i - 1) \lambda e^{\lambda \; Y_i} }{(\Delta_i - 1)\lambda e^{\lambda \; Y_i} - \Delta_i \lambda e^{\lambda \; c_i}}.
\end{align*}
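
Since each $\Delta_i$ is either $0$ or $1$, each summand of the derivative of the log-likelihood reduces to $\Delta_i(1/\lambda - Y_i) - (1-\Delta_i)c_i$. Here is a minimal finite-difference check of that simplification (the data values are made up for illustration):

```python
import math

def loglik(lam, y, delta, c):
    # log-likelihood: sum of delta*(log lam - lam*y) + (1-delta)*(-lam*c)
    return sum(d * (math.log(lam) - lam * yi) + (1 - d) * (-lam * ci)
               for yi, d, ci in zip(y, delta, c))

def grad(lam, y, delta, c):
    # simplified per-term derivative: delta*(1/lam - y) - (1-delta)*c
    return sum(d * (1 / lam - yi) - (1 - d) * ci
               for yi, d, ci in zip(y, delta, c))

# made-up sample: observations 2 and 3 are censored at their ceilings
c = [1.0, 2.0, 0.5]
y = [0.3, 2.0, 0.5]
delta = [1, 0, 0]

# central finite difference of the log-likelihood at an arbitrary lambda
lam, h = 0.8, 1e-6
fd = (loglik(lam + h, y, delta, c) - loglik(lam - h, y, delta, c)) / (2 * h)
print(fd, grad(lam, y, delta, c))
```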

But as far as I'm concerned, this brings us nowhere.

If someone could point out where I went wrong, I'd really appreciate it. Thanks in advance for any replies!

Best Answer

The approach is correct. Contrary to the title of the question, the $c$'s are designated as constants. So $Y_i$ has the distribution of $X_i$ but with a ceiling, and each time we hit the ceiling, the probability is allocated to $c_i$. So the distribution function is indeed

$$\begin{align*} F_{Y_i}(y) &= 1 - e^{-\lambda y} &0 \leq y < c_i \\ F_{Y_i}(y) &= 1 &y = c_i \end{align*}$$

The density likewise comes in two pieces, a continuous part plus a point mass at the ceiling, i.e.

$$\begin{align*} f_{Y_i}(y) &= \lambda e^{-\lambda y} &0 \leq y < c_i \\ P(Y_i = c_i) = 1-\lim_{y \uparrow c_i} F_{Y_i}(y) &= e^{- \lambda c_i} & \end{align*}$$

which "integrates to unity" alright since (skipping formalities)

$$\int_{S_{Y_i}}f_{Y_i}(y)dy = \int_0^{c_i}\lambda e^{-\lambda y}dy + e^{- \lambda c_i} = - e^{-\lambda y} \Big |_0^{c_i}+ e^{- \lambda c_i} = -e^{- \lambda c_i} +1 +e^{- \lambda c_i} =1$$
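
The same bookkeeping can be checked numerically (a sketch; $\lambda$ and $c_i$ are arbitrary): midpoint-rule integration of the continuous part plus the point mass should return $1$.

```python
import math

lam, c = 0.7, 1.5          # arbitrary illustrative values
N = 100_000
dy = c / N

# midpoint rule over [0, c) for the continuous part, then add the atom at c
cont = sum(lam * math.exp(-lam * (i + 0.5) * dy) * dy for i in range(N))
total = cont + math.exp(-lam * c)
print(total)
```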

Note that the density is discontinuous at $c_i$: the continuous part approaches $\lambda e^{-\lambda c_i}$ while the point mass equals $e^{-\lambda c_i}$, and these coincide only if $\lambda = 1$.

As long as the $X_i$'s are assumed independent, your joint likelihood function is correct (just make the $y$'s lower-case), and it is a standard case of a likelihood function "regulated" by an indicator function. Note that there may be a small conceptual hurdle: if we observe $y_i = c_i$, does this mean that $y_i = x_i$ or not? Since $X_i$ is continuous and the probability of it taking any specific value is zero, we treat the case $Y_i = c_i$ as implying that $Y_i \neq X_i$. Your indicator is then equivalent to $\Delta_i = \textbf{1} \{ y_i < c_i \}$, which is completely deterministic given the sample.

If you want to proceed with estimation, you are presumed to know the $c$-constants (otherwise you do not have enough data to estimate anything), and you have a sample of $y_i$'s; from these you can construct the indicator series. You might expect the MLE for $\lambda$ to require an iterative numerical procedure, but an analytical solution exists here: since each $\Delta_i \in \{0,1\}$ and $y_i = c_i$ whenever $\Delta_i = 0$, the log-likelihood collapses to $\log \mathcal{L}(\lambda) = \left(\sum_i \Delta_i\right) \log \lambda - \lambda \sum_i y_i$, whose maximizer is $\hat\lambda = \sum_i \Delta_i \big/ \sum_i y_i$, i.e. the number of uncensored observations divided by the total observed time.
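
To make this concrete, here is a small simulation sketch (the true $\lambda$ and the ceilings below are arbitrary illustrations). Setting the score $\sum_i \Delta_i(1/\lambda - y_i) - \sum_i (1-\Delta_i)c_i$ to zero and using $y_i = c_i$ for censored observations gives the closed form $\hat\lambda = \sum_i \Delta_i \big/ \sum_i y_i$, which the simulation should approximately recover:

```python
import random

random.seed(1)
lam_true = 1.3             # illustrative true rate
n = 50_000
c = [random.uniform(0.2, 2.0) for _ in range(n)]    # hypothetical known ceilings
y = [min(random.expovariate(lam_true), ci) for ci in c]
delta = [1 if yi < ci else 0 for yi, ci in zip(y, c)]

# closed-form MLE: uncensored count divided by total observed time
lam_hat = sum(delta) / sum(y)
print(lam_hat)
```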

To verify that your gradient is correct, assume that you obtain a sample of $y_i$'s in which $y_i<c_i,\; \forall i$. Then $\Delta_i =1,\; \forall i$ and the gradient becomes

$$ \frac{ \mathrm{d} \log \mathcal{L}(\lambda)}{ \mathrm{d} \lambda} = -\sum_{i = 1}^n \frac{( \lambda y_i - 1)}{\lambda } = 0 \Rightarrow \frac 1 {\lambda} = \frac 1n \sum_{i = 1}^n y_i $$

which is as it should be, since if all $y_i$-realizations are below the ceiling, the $Y_i$'s are treated as proper exponential variables themselves, and the ceilings, being non-binding, do not affect the estimation.
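
This special case can be checked the same way (again a sketch with arbitrary values): with ceilings far above the observations every $\Delta_i = 1$, and $\hat\lambda$ reduces to the reciprocal of the sample mean.

```python
import random

random.seed(2)
lam_true, n = 2.0, 100_000
# ceilings so high they essentially never bind, so every delta_i = 1
y = [min(random.expovariate(lam_true), 50.0) for _ in range(n)]
lam_hat = n / sum(y)       # 1 / sample mean: the usual exponential MLE
print(lam_hat)
```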
