Probability – Finding the UMVUE of the Difference of the Parameters of Two Exponential Distributions

exponential-distribution, missing-data, probability, self-study, unbiased-estimator

Let $X_{1}, \dots, X_{n}$ be i.i.d. having the exponential distribution $Exp\left(0, \theta_{x}\right)$ with $\theta_{x}>0$, and $Y_{1}, \dots, Y_{n}$ be i.i.d. having the exponential distribution $Exp\left(0,\theta_{y}\right)$ with $\theta_{y}>0$.
Assume that the $X_{i}$'s and $Y_{j}$'s are independent, but they are unobservable.

Suppose that our sample is $\left(Z_{1}, \Delta_{1}\right), \dots,\left(Z_{n}, \Delta_{n}\right)$, where $Z_{i}= X_{i}(1-\Delta_{i}) + Y_{i}\Delta_{i}$ and $\Delta_{i}=I(X_i \ge Y_i)$ for $i=1, \dots, n$.
Please find an unbiased estimator and the UMVUE of $\theta_{x}-\theta_{y}$.

Note that the probability density function (p.d.f.) of $Exp(a,\theta)$ is $\theta^{-1}e^{-(x-a)/\theta}I(x>a)$.

The parameter $\theta$ is commonly called the MTBF (mean time between failures) or MTTF (mean time to failure).
Its inverse, $1/\theta$, is the hazard rate.
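As a quick sanity check of these interpretations, here is a short Python sketch (my own aside, not part of the exercise; $a=0$ and $\theta=2$ are arbitrary test values):

```python
import numpy as np

rng = np.random.default_rng(0)
a, theta = 0.0, 2.0  # arbitrary test values

# Draw from Exp(a, theta): location a plus an exponential with scale theta.
x = a + rng.exponential(scale=theta, size=1_000_000)

print(x.mean())  # ~ a + theta: with a = 0, the mean is theta (the MTBF/MTTF)

# A constant hazard rate 1/theta is equivalent to memorylessness:
# P(X > a + s + t | X > a + s) = exp(-t/theta) for all s, t > 0.
s, t = 1.0, 1.5
print((x > a + s + t).mean() / (x > a + s).mean())  # empirical
print(np.exp(-t / theta))                           # theoretical
```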

This question is a part of Exercise 3.9 in Jun Shao's Mathematical Statistics (2003).

The following is my attempt.

Here is the skeleton of my solution.
We first find a complete and sufficient statistic $T$ and then construct an unbiased estimator $\widehat{S}$; by Lehmann–Scheffé, the UMVUE is then $\mathbb{E}[\widehat{S}\vert T]$.

Let $U_i = X_i - Y_i$.
When $u\le 0$, we have
\begin{align*}
P(U_i \le u)
& = P(X_i- Y_i \le u) \\
& = \int_{0}^{\infty} \int_{x-u}^{\infty} \frac{1}{\theta_x\theta_y} \exp\left\{ -\frac{x}{\theta_x} -\frac{y}{\theta_y} \right\} d y d x \\
& = \exp (u / \theta_y) \frac{\theta_y}{\theta_x + \theta_y} .
\end{align*}

This implies $P(\Delta_i = 1) = \frac{\theta_x}{\theta_x + \theta_y}$.
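As a quick Monte Carlo check of the last two displays (my own aside; the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
theta_x, theta_y, n = 2.0, 3.0, 1_000_000  # arbitrary test values

X = rng.exponential(scale=theta_x, size=n)
Y = rng.exponential(scale=theta_y, size=n)
U = X - Y

u = -1.0  # any u <= 0
print((U <= u).mean())                                      # empirical c.d.f. of U at u
print(np.exp(u / theta_y) * theta_y / (theta_x + theta_y))  # closed form derived above

print((X >= Y).mean())                # empirical P(Delta_i = 1)
print(theta_x / (theta_x + theta_y))  # theta_x / (theta_x + theta_y)
```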
Further,
\begin{align*}
P ( Z_i \le z, \Delta_i = 1 ) & = \frac{\theta_x}{\theta_x + \theta_y} - \frac{\theta_x}{\theta_x + \theta_y} \exp \left( - \frac{\theta_x + \theta_y}{\theta_x \theta_y} z\right) \triangleq P_{1,i} , \\
P ( Z_i \le z, \Delta_i = 0 ) & = \frac{\theta_y}{\theta_x + \theta_y} - \frac{\theta_y}{\theta_x + \theta_y} \exp \left( - \frac{\theta_x + \theta_y}{\theta_x \theta_y} z\right) \triangleq P_{0,i} .
\end{align*}

Then $Z_i$ and $\Delta_i$ are independent, $Z_i\sim Exp\left(0, \frac{\theta_x\theta_y}{\theta_x+\theta_y}\right)$, and $P_{Z_i,\Delta_i} (z,\delta) = P_{1,i}^{\delta} P_{0,i}^{1-\delta}$.
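Again a Monte Carlo check, this time of the marginal law of $Z_i$ and of the independence claim (my own aside; arbitrary parameter values):

```python
import numpy as np

rng = np.random.default_rng(2)
theta_x, theta_y, n = 2.0, 3.0, 1_000_000  # arbitrary test values

X = rng.exponential(scale=theta_x, size=n)
Y = rng.exponential(scale=theta_y, size=n)
Z = np.minimum(X, Y)  # Z = X(1 - Delta) + Y Delta reduces to min(X, Y)
Delta = X >= Y

scale = theta_x * theta_y / (theta_x + theta_y)
print(Z.mean(), scale)  # Z ~ Exp(0, theta_x theta_y / (theta_x + theta_y))

# Independence: the conditional law of Z should not depend on Delta.
print(Z[Delta].mean(), Z[~Delta].mean())                        # both ~ scale
print((Z[Delta] <= scale).mean(), (Z[~Delta] <= scale).mean())  # matching c.d.f. values
```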

Notice that the p.d.f. of the joint distribution of $Z_i$'s and $\Delta_i$'s is given by
\begin{align*}
& p\left( Z_1 = z_1, \dots, Z_n = z_n, \Delta_1 = \delta_1, \dots, \Delta_n = \delta_n \right) \\
& = \prod_{i=1}^n \left[ \frac{\theta_x + \theta_y}{\theta_x \theta_y} \exp\left( - \frac{\theta_x + \theta_y}{\theta_x \theta_y} z_i \right) \left\{ \frac{\theta_x}{\theta_x + \theta_y} I (\delta_i = 1) + \frac{\theta_y}{\theta_x + \theta_y} I (\delta_i = 0)\right\} I(z_i>0) \right]\\
& = \left(\frac{\theta_x + \theta_y}{\theta_x \theta_y}\right)^{n} \exp\left( - \frac{\theta_x + \theta_y}{\theta_x \theta_y} \sum_{i=1}^n z_i \right) \left( \frac{\theta_x}{\theta_x + \theta_y} \right)^{\sum_{i=1}^n \delta_i} \left( \frac{\theta_y}{\theta_x + \theta_y} \right)^{n- \sum_{i=1}^n \delta_i} I(z_{(1)}>0) \\
& = \exp\left( - \frac{\theta_x + \theta_y}{\theta_x \theta_y} \sum_{i=1}^n z_i \right) \theta_x^{-n+\sum_{i=1}^n \delta_i} \theta_y^{- \sum_{i=1}^n \delta_i} I(z_{(1)}>0) .
\end{align*}

Therefore, the joint distribution of the $(Z_i, \Delta_i)$'s forms a full-rank exponential family, and $T = (\sum_{i=1}^n \Delta_i, \sum_{i=1}^n Z_i)$ is a complete and sufficient statistic.
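The factorization can also be confirmed numerically: on simulated data, the per-observation product and the collapsed form agree, and the collapsed form depends on the data only through $T$ (my own sketch; values arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
theta_x, theta_y, n = 2.0, 3.0, 8  # arbitrary test values

X = rng.exponential(scale=theta_x, size=n)
Y = rng.exponential(scale=theta_y, size=n)
z = np.minimum(X, Y)
d = (X >= Y).astype(int)

rate = (theta_x + theta_y) / (theta_x * theta_y)

# Per-observation form: Exp density of Z_i times the Bernoulli mass of Delta_i.
per_obs = np.prod(rate * np.exp(-rate * z)
                  * np.where(d == 1, theta_x, theta_y) / (theta_x + theta_y))

# Collapsed exponential-family form: depends on the data only through
# T = (sum of Delta_i, sum of Z_i).
collapsed = (np.exp(-rate * z.sum())
             * theta_x ** (d.sum() - n) * theta_y ** (-d.sum()))

print(per_obs, collapsed)  # equal up to floating-point error
```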

I don't know how to continue, or whether the UMVUE actually exists.
Thanks for any suggestions and answers.


(update)

Thanks for the suggestion from @Xi'an.

If we were interested in $1/\theta_x - 1/\theta_y$, we could construct an unbiased estimator from the identity $\frac{n-1}{n} \mathbb{E} \left[ \frac{\sum_{i=1}^n (1-\Delta_i)}{\sum_{i=1}^n Z_i} \right] - \frac{n-1}{n} \mathbb{E} \left[ \frac{\sum_{i=1}^n \Delta_i}{\sum_{i=1}^n Z_i} \right] = 1/\theta_x - 1/\theta_y$.
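To spell out why this identity holds: $\sum_{i=1}^n \Delta_i \sim \mathrm{Binomial}\left(n, \frac{\theta_x}{\theta_x+\theta_y}\right)$ and $\sum_{i=1}^n Z_i \sim \mathrm{Gamma}$ with shape $n$ and scale $\frac{\theta_x\theta_y}{\theta_x+\theta_y}$ are independent, and for such a Gamma variable $\mathbb{E}\left[\left(\sum_{i=1}^n Z_i\right)^{-1}\right] = \frac{1}{n-1}\frac{\theta_x+\theta_y}{\theta_x\theta_y}$, so that
\begin{align*}
\frac{n-1}{n} \mathbb{E}\left[ \frac{\sum_{i=1}^n (1-\Delta_i)}{\sum_{i=1}^n Z_i} \right]
= \frac{n-1}{n} \cdot n\,\frac{\theta_y}{\theta_x+\theta_y} \cdot \frac{1}{n-1}\,\frac{\theta_x+\theta_y}{\theta_x\theta_y}
= \frac{1}{\theta_x} ,
\end{align*}
and symmetrically the $\Delta_i$ term has expectation $1/\theta_y$.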

But in this question, we focus on $\theta_x -\theta_y$.
Although $\mathbb{E}[X_i] = \theta_x$, the conditional expectation $\mathbb{E}[X_i \vert Z_i,\Delta_i]$ is a function of $\theta_x$, $Z_i$, and $\Delta_i$, and it is non-linear in $\theta_x$.
So, I don't know how to continue.

Best Answer

Changing the question in two different ways allows for some answers:

  1. If $\theta_x$ and $\theta_y$ are rate rather than scale parameters, so that the target $\theta_x-\theta_y$ is the difference of the two rates, $$ \frac{n-1}{n} \frac{\sum_{i=1}^n (1-\Delta_i)}{\sum_{i=1}^n Z_i} - \frac{n-1}{n} \frac{\sum_{i=1}^n \Delta_i}{\sum_{i=1}^n Z_i}\tag{1} $$ is an unbiased estimator of $\theta_x - \theta_y$ and since it only depends on $\mathbf T$, it is the UMVUE (checked numerically in the simulation sketch at the end of this answer).

  2. If instead $Z=\max\{X,Y\}$, with $\theta_x$ and $\theta_y$ scale parameters, consider the conditional distribution of $Z$ given $\Delta=1$. Since $$\mathbb P(\Delta=1)=\frac{\theta_x}{\theta_x+\theta_y}$$ we have \begin{align} \mathbb P(Z\le z|\Delta=1)&=\frac{\theta_x+\theta_y}{\theta_x}\mathbb P(Z\le z,\Delta=1)\\ &=\frac{\theta_x+\theta_y}{\theta_x}\mathbb P(X\le z,X>Y)\\ &=\frac{\theta_x+\theta_y}{\theta_x}\int_0^z\int_0^x \frac1{\theta_x\theta_y} \exp\{-x/\theta_x-y/\theta_y\}\,\text dy\,\text dx\\ &=\frac{\theta_x+\theta_y}{\theta_x}\int_0^z(1-\exp\{-x/\theta_y\})\frac{\exp\{-x/\theta_x\}}{\theta_x}\,\text dx\\ &=\frac{\theta_x+\theta_y}{\theta_x}[1-\exp\{-z/\theta_x\}]-\\ &\qquad\frac{\theta_x+\theta_y}{\theta_x^2}(\theta_x^{-1}+\theta_y^{-1})^{-1} [1-\exp\{-z(\theta_x^{-1}+\theta_y^{-1})\}]\\ &=\frac{\theta_x+\theta_y}{\theta_x}[1-\exp\{-z/\theta_x\}]- \frac{\theta_y}{\theta_x}[1-\exp\{-z(\theta_x+\theta_y)/\theta_x\theta_y\}] \end{align} This is a signed mixture of two Exponential distributions, $$\frac{\theta_x+\theta_y}{\theta_x}\,\mathcal Exp(\theta_x)-\frac{\theta_y}{\theta_x}\,\mathcal Exp(\theta_x\theta_y/(\theta_x+\theta_y)),$$ which is illustrated by the fit in the following graphs:

[Figure: fit of the signed Exponential mixture to simulated values of $Z$ given $\Delta$.]

based on $n=10^6$ simulations from $\mathcal Exp(10)$ and $\mathcal Exp(1/10)$ samples. This distribution has mean \begin{align}\mathbb E[Z|\Delta=1] &=\frac{\theta_x+\theta_y}{\theta_x}\theta_x-\frac{\theta_y}{\theta_x}\frac{\theta_x\theta_y}{\theta_x+\theta_y}\\ &=\theta_x+\theta_y\left[1-\frac{\theta_y}{\theta_x+\theta_y}\right]\\ &=\theta_x+\frac{\theta_x\theta_y}{\theta_x+\theta_y}\end{align} The second term above is symmetric in $(\theta_x,\theta_y)$. Therefore, $$\mathbb E[Z|\Delta=1]-\mathbb E[Z|\Delta=0]=\theta_x-\theta_y,$$ which leads immediately to an unbiased estimator based on $(\mathbf Z,\boldsymbol \Delta)$: $$\dfrac{\sum_{i=1}^n Z_i\Delta_i}{\sum_{i=1}^n\Delta_i}- \dfrac{\sum_{i=1}^n Z_i\{1-\Delta_i\}}{\sum_{i=1}^n\{1-\Delta_i\}}\tag{2}$$ although I cannot tell whether (2) is UMVUE, as the $Z_i$'s are no longer from an exponential family.
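Here is a small Monte Carlo sketch checking both estimators (the parameter values, seed, and sample sizes are arbitrary; for (1) the $\theta$'s are read as rates with $Z=\min\{X,Y\}$, for (2) as scales with $Z=\max\{X,Y\}$):

```python
import numpy as np

rng = np.random.default_rng(4)
theta_x, theta_y = 2.0, 1.0  # arbitrary test values
n, reps = 10, 50_000

est1 = np.empty(reps)         # estimator (1): theta's read as rates, Z = min(X, Y)
est2 = np.full(reps, np.nan)  # estimator (2): theta's read as scales, Z = max(X, Y)

for r in range(reps):
    # Rate reading for (1): Exp with rate theta has scale 1/theta.
    X1 = rng.exponential(scale=1 / theta_x, size=n)
    Y1 = rng.exponential(scale=1 / theta_y, size=n)
    D1 = X1 >= Y1
    est1[r] = (n - 1) / n * ((n - D1.sum()) - D1.sum()) / np.minimum(X1, Y1).sum()

    # Scale reading for (2), with Z = max(X, Y) as in point 2.
    X2 = rng.exponential(scale=theta_x, size=n)
    Y2 = rng.exponential(scale=theta_y, size=n)
    D2 = X2 >= Y2
    Z2 = np.maximum(X2, Y2)
    if 0 < D2.sum() < n:  # both groups must be nonempty
        est2[r] = Z2[D2].mean() - Z2[~D2].mean()

print(est1.mean(), theta_x - theta_y)       # (1): ~ theta_x - theta_y
print(np.nanmean(est2), theta_x - theta_y)  # (2): ~ theta_x - theta_y, conditionally
                                            # on both groups being nonempty
```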