Exercise about Order statistics from uniform distribution

Tags: exponential-distribution, order-statistics, probability, self-study, uniform-distribution

I'm trying to solve an exercise about order statistics.

The exercise is the following:

Let $U_{(1)} < \cdots < U_{(n)}$ be the order statistics of a sample from the uniform distribution $U(0,1)$.

Show that $-\log\big[U_{(r)}/U_{(r+1)}\big]^r \overset{d}{=} Z_{n-r+1}$, where $Z_1, Z_2, \ldots, Z_n$ are i.i.d. $\mathrm{Exp}(1)$.

The solution first shows that $-\log U_{(r)} \overset{d}{=} \frac{1}{n}Z_1 + \cdots + \frac{1}{r}Z_{n-r+1}$, where $Z_1, \ldots, Z_n$ are i.i.d. $\mathrm{Exp}(1)$; I follow the argument up to this point.

The solution then says that $-\log[U_{(r)}/U_{(r+1)}] \overset{d}{=} \left(\frac{1}{n}Z_1 + \cdots + \frac{1}{r}Z_{n-r+1}\right) - \left(\frac{1}{n}Z_1 + \cdots + \frac{1}{r+1}Z_{n-r}\right) = \frac{1}{r}Z_{n-r+1}$, and therefore $-\log[U_{(r)}/U_{(r+1)}]^r \overset{d}{=} Z_{n-r+1}$.

But can we really do such a simple subtraction?

In any case, I know the conclusion is true from a well-known result:

[Image: the cited result from Hogg's book — if $X_{(1)} < \cdots < X_{(n)}$ are the order statistics of an i.i.d. $\mathrm{Exp}(1)$ sample, then $Z_1 = nX_{(1)}$ and $Z_j = (n - j + 1)(X_{(j)} - X_{(j-1)})$ for $j = 2, \ldots, n$ are i.i.d. $\mathrm{Exp}(1)$.]

(This is one of the exercises from the well-known book Introduction to Mathematical Statistics by Hogg.)

From this we know that $-\log U_{(r)} \overset{d}{=} X_{(n-r+1)}$, where $X_1, \dots, X_n$ are i.i.d. $\mathrm{Exp}(1)$, and hence $-\log[U_{(r)}/U_{(r+1)}]^r \overset{d}{=} r(X_{(n-r+1)} - X_{(n-r)}) \sim \mathrm{Exp}(1)$.

However, I couldn't understand the subtraction claim in the solution.
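(As a sanity check — not a proof — a quick Monte Carlo simulation in Python/NumPy supports the claimed identity; the values of $n$, $r$, the seed, and the sample size here are arbitrary choices:)

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, reps = 10, 3, 200_000

# reps independent samples of n uniforms, sorted row-wise -> order statistics
U = np.sort(rng.uniform(size=(reps, n)), axis=1)

# the claimed Exp(1) variable: -log([U_(r)/U_(r+1)]^r) = -r * log(U_(r)/U_(r+1))
W = -r * np.log(U[:, r - 1] / U[:, r])

# Exp(1) has mean 1 and variance 1; both sample moments should be close to 1
print(W.mean(), W.var())
```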

Best Answer

"Can we do such a simple subtraction?", in general, we can't, because one cannot conclude $X - Y \overset{d}{=} \xi - \eta$ from $X \overset{d}{=} \xi$ and $Y \overset{d}{=} \eta$ without knowing the joint distribution of $(X, Y)$. On the other hand, the catch here is that the "$\overset{d}{=}$" relation holds not only marginally, but jointly. In fact, the result that you cited from Hogg's book can be written as: \begin{align*} \begin{bmatrix} Z_1 \\ Z_2 \\ Z_3 \\ \vdots \\ Z_n \end{bmatrix} := \begin{bmatrix} nX_{(1)} \\ (n - 1)(X_{(2)} - X_{(1)}) \\ (n - 2)(X_{(3)} - X_{(2)}) \\ \vdots \\ X_{(n)} - X_{(n - 1)} \end{bmatrix} , \tag{1}\label{1} \end{align*} where $X_1, \cdots, X_n \text{ i.i.d.} \sim \mathrm{Exp}(1)$ and $Z_1, \ldots, Z_n \text{ i.i.d.} \sim \mathrm{Exp}(1)$. In the post, you indicated that you understood the relation $-\log U_{(r)} \overset{d}{=} X_{(n - r + 1)}$ for $r = 1, \ldots, n$. But in fact, you can achieve something stronger, which is
\begin{align*} \begin{bmatrix} -\log U_{(1)} \\ -\log U_{(2)} \\ -\log U_{(3)}\\ \vdots \\ -\log U_{(n)} \end{bmatrix} \overset{d}{=} \begin{bmatrix} X_{(n)} \\ X_{(n - 1)} \\ X_{(n - 2)} \\ \vdots \\ X_{(1)} \end{bmatrix}. \tag{2}\label{2} \end{align*} Suppose for the moment that we have derived $\eqref{2}$; then the "subtraction" operation becomes legitimate precisely because $\eqref{2}$ is a joint relation. In general, if $(\xi_1, \ldots, \xi_n) \overset{d}{=} (\eta_1, \ldots, \eta_n)$, then for any $n$-variate Borel function $f$, one has $f(\xi_1, \ldots, \xi_n) \overset{d}{=} f(\eta_1, \ldots, \eta_n)$. To see this, for any $x \in \mathbb{R}$, we have \begin{align*} & P(f(\xi_1, \ldots, \xi_n) \leq x) \\ ={} & P((\xi_1, \ldots, \xi_n) \in f^{-1}((-\infty, x])) \\ ={} & P((\eta_1, \ldots, \eta_n) \in f^{-1}((-\infty, x])) \tag{3}\label{3} \\ ={} & P(f(\eta_1, \ldots, \eta_n) \leq x), \end{align*} which shows that $f(\xi_1, \ldots, \xi_n)$ and $f(\eta_1, \ldots, \eta_n)$ share a common distribution function, whence $f(\xi_1, \ldots, \xi_n) \overset{d}{=} f(\eta_1, \ldots, \eta_n)$. It is in $\eqref{3}$ that we used the condition $(\xi_1, \ldots, \xi_n) \overset{d}{=} (\eta_1, \ldots, \eta_n)$. As stated at the beginning of the answer, $\eqref{3}$ usually breaks if we are only given the $n$ marginal identities $\xi_i \overset{d}{=} \eta_i$, $i = 1, \ldots, n$.
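To illustrate numerically why marginal "$\overset{d}{=}$" identities are not enough, here is a small NumPy sketch (the normal distribution is just a convenient example here, not part of the original exercise): the pairs $(Z, Z)$ and $(\xi, \eta)$ have identical marginals, yet their differences have completely different distributions.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000
Z = rng.standard_normal(N)
xi = rng.standard_normal(N)
eta = rng.standard_normal(N)

# Marginally Z =d xi and Z =d eta, but the joint laws of (Z, Z)
# and (xi, eta) differ, so the differences need not agree:
d1 = Z - Z        # identically 0
d2 = xi - eta     # N(0, 2), variance about 2

print(d1.var(), d2.var())
```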


Having, I hope, cleared up your main confusion, it remains to show why $\eqref{2}$ holds. To this end, first note that by the i.i.d. assumption on $\{X_1, \ldots, X_n\}$ and the probability integral transform (the distribution function of a standard exponential random variable is $F(x) = 1 - e^{-x}$), we have $F(X_1) = 1 - \exp(-X_1), \ldots, F(X_n) = 1 - \exp(-X_n) \text{ i.i.d.} \sim U(0, 1)$. That is, writing $U_i := \exp(-X_i)$ (so that $F(X_i) = 1 - U_i$), we get
\begin{align*} \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix} = \begin{bmatrix} -\log U_1 \\ -\log U_2 \\ \vdots \\ -\log U_n \end{bmatrix}, \tag{4}\label{4} \end{align*} where $U_1, \ldots, U_n \text{ i.i.d. } \sim U(0, 1)$.
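(As a quick numerical sanity check of this probability integral transform step — seed and sample size are arbitrary — $\exp(-X)$ for $X \sim \mathrm{Exp}(1)$ should look like a $U(0,1)$ draw:)

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.exponential(size=200_000)  # Exp(1) sample
U = np.exp(-X)                     # so X = -log(U)

# Uniform(0,1) has mean 1/2 and variance 1/12
print(U.mean(), U.var())
```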

Now, since the function $x \mapsto -\log x$ is strictly decreasing, it reverses order, so $\eqref{4}$ implies that \begin{align*} \begin{bmatrix} X_{(n)} \\ X_{(n - 1)} \\ X_{(n - 2)} \\ \vdots \\ X_{(1)} \end{bmatrix} \overset{d}{=} \begin{bmatrix} -\log U_{(1)} \\ -\log U_{(2)} \\ -\log U_{(3)}\\ \vdots \\ -\log U_{(n)} \end{bmatrix}, \end{align*} which is exactly $\eqref{2}$.
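The joint relation $\eqref{1}$ can itself be checked by simulation: the normalized spacings of exponential order statistics should behave like i.i.d. $\mathrm{Exp}(1)$ draws. A sketch (with arbitrary $n$, seed, and sample size):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 5, 200_000

# order statistics of Exp(1) samples, one sample per row
X = np.sort(rng.exponential(size=(reps, n)), axis=1)

# normalized spacings: Z_1 = n*X_(1), Z_j = (n - j + 1)*(X_(j) - X_(j-1))
spacings = np.diff(X, axis=1, prepend=0.0)
Z = spacings * (n - np.arange(n))

print(Z.mean(axis=0))                         # each entry close to 1 (Exp(1) mean)
print(np.corrcoef(Z, rowvar=False).round(2))  # close to the identity (uncorrelated)
```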

P.S. The $Y_r$ in Hogg's exercise can be written more explicitly as \begin{align*} Y_r = \sum_{j = 1}^r \frac{Z_j}{n + 1 - j}, \quad r = 1, \ldots, n. \end{align*} This is known as the Rényi representation of ordered exponential random variables. An intuitive argument of stochastic-process flavor for this representation can be found in Example 2.28 of Statistical Models by A. C. Davison.
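As a rough check of this representation, the means it implies, $\mathbb{E}[X_{(r)}] = \sum_{j=1}^r 1/(n+1-j)$, can be compared against simulated exponential order statistics (the choices of $n$, seed, and sample size below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 6, 200_000
X = np.sort(rng.exponential(size=(reps, n)), axis=1)

# Renyi representation gives E[X_(r)] = sum_{j=1}^{r} 1/(n + 1 - j)
for r in range(1, n + 1):
    theory = sum(1.0 / (n + 1 - j) for j in range(1, r + 1))
    print(r, round(X[:, r - 1].mean(), 3), round(theory, 3))
```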
