Suppose $X_1, \dots, X_n$ are the order statistics of an iid sample from a continuous distribution $F(x)$. Show that $P(X_k \le x) = P\{N(x) \ge k\}$, where $N(x)$, the number of sample values less than $x$, is binomial with parameters $n$ and probability $p = F(x)$.

I believe the statement looks like this:

$$P(X_k \le x) = \sum_{k=0}^n\binom{n}{k}F(x)^k(1-F(x))^{n-k}$$

However, I do not see how $\sum_{k=0}^n\binom{n}{k}F(x)^k(1-F(x))^{n-k} = P\{N(x) \ge k\}$

Additional data:

Use the result above to show that the density of $X_k$ is

$$p(x) = k\binom{n}{k}F(x)^{k-1}(1-F(x))^{n-k}f(x)$$ where $f(x)$ is the density of $F(x)$. Verify the density using the multinomial argument:

$$n!\epsilon^n\prod_ip_{\theta}(x_i)$$

which is a general heuristic to deal with order statistics from an iid sample from a continuous density $p_{\theta}(x)$.

## Best Answer

You can prove this by showing that the events on the left side and the right side, $$X_{(k)} \le x \qquad \text{and} \qquad N(x) \ge k$$ are the same event.

There is, however, a tricky detail: they are not exactly the same event.

You have on the left side an inequality that is not strict (less than or equal), whereas on the right side you have a strict inequality, with $N(x)$ defined as 'the number of sample values less than $x$'.

## Discrete distribution

For discrete distributions this difference in the events will result in the probabilities not being equal. Take for instance a sample of size one drawn from a Bernoulli distribution. Then $$P(X_{(1)} \leq 1) = 1 \qquad \text{and} \quad P(N(1) \geq 1) = 1-p$$
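The Bernoulli example can be checked by exact enumeration. The following is a minimal sketch, not part of the original argument; the function name `discrete_probs` and the choice $p = 0.3$ are assumptions made for illustration. It sums the probabilities of all 0/1 samples for which each event holds.

```python
from itertools import product

def discrete_probs(n, k, x, p):
    """Exact P(X_(k) <= x) and P(N(x) >= k) for n iid Bernoulli(p) draws.

    N(x) counts the sample values strictly less than x.
    (Hypothetical helper, introduced only for this illustration.)
    """
    p_order = 0.0   # accumulates P(X_(k) <= x)
    p_count = 0.0   # accumulates P(N(x) >= k)
    for sample in product((0, 1), repeat=n):
        ones = sum(sample)
        weight = p ** ones * (1 - p) ** (n - ones)   # probability of this sample
        kth = sorted(sample)[k - 1]                  # k-th order statistic
        strictly_less = sum(v < x for v in sample)   # N(x)
        if kth <= x:
            p_order += weight
        if strictly_less >= k:
            p_count += weight
    return p_order, p_count

# Sample of size one, k = 1, x = 1, with an assumed p = 0.3
p_order, p_count = discrete_probs(n=1, k=1, x=1, p=0.3)
print(p_order, p_count)
```

With these inputs the first value should equal $1$ and the second $1 - p$, up to floating-point rounding, matching the two probabilities above.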

## Continuous distribution

To still prove the result for continuous distributions, we can use one of the following alternative equations as a starting point instead. They are obtained by replacing either the $\leq$ sign on the left by a $<$ sign, or the $<$ sign on the right (in the definition of $N(x)$) by a $\leq$ sign, so that the events are the same.

$$\begin{array}{rcl} P(X_{(k)} < x) &=& P(N(x) \ge k) \\ P(X_{(k)} \le x) &=& P(N^\prime(x) \ge k) \end{array}$$

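with $N^\prime(x)$ meaning the number of sample values less than or equal to $x$.

For these two expressions, you can easily see that the events are the same because they imply each other (and the events are also the same in the discrete case). Take for example the second statement: if $X_{(k)} \le x$, then the $k$ smallest sample values are all at most $x$, so $N^\prime(x) \ge k$; conversely, if $N^\prime(x) \ge k$, then at least $k$ sample values are at most $x$, so the $k$-th smallest value satisfies $X_{(k)} \le x$.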
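That the two pairs of events coincide sample by sample, even when ties are possible, can be checked by brute force. This is only an illustrative sketch: the helper `events_agree` and the small support `(0, 1, 2)` are assumptions chosen so that every sample can be enumerated.

```python
from itertools import product

def events_agree(values, n, k, x):
    """Check, for every sample of size n with entries from `values`, that
    {X_(k) < x}  is the same event as {N(x) >= k}   (strict count), and
    {X_(k) <= x} is the same event as {N'(x) >= k}  (non-strict count).
    (Hypothetical helper, introduced only for this illustration.)
    """
    for sample in product(values, repeat=n):
        kth = sorted(sample)[k - 1]               # k-th order statistic
        n_strict = sum(v < x for v in sample)     # N(x)
        n_weak = sum(v <= x for v in sample)      # N'(x)
        if (kth < x) != (n_strict >= k):
            return False
        if (kth <= x) != (n_weak >= k):
            return False
    return True

# Enumerate all samples of size 4 from {0, 1, 2}, for every k and threshold x
all_ok = all(
    events_agree(values=(0, 1, 2), n=4, k=k, x=x)
    for k in range(1, 5)
    for x in (0, 1, 2)
)
print(all_ok)
```

Because the check runs over individual samples rather than probabilities, it does not depend on the distribution being continuous.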

The trick to complete the proof for the continuous distribution is that the probabilities of the different events (with signs $\leq$ or $<$) are the same. We have

$$\begin{array}{rcl} P(X_{(k)} \leq x) &=& P(X_{(k)} < x) \\ P(N(x) \ge k) &=& P(N^\prime(x) \ge k) \end{array}$$

The reason is that the difference between the two events has probability zero. For instance, $$\begin{array}{rcl} P(X_{(k)} \leq x) - P(X_{(k)} < x) &=& P(\lbrace X_{(k)} \leq x \rbrace \setminus \lbrace X_{(k)} < x \rbrace)\\ &=& P(X_{(k)} = x) \\ &=& 0 \end{array}$$ where the last step holds because $X_{(k)}$ has a continuous distribution.

## Summarizing

$$\begin{array}{ccc} \rlap{\overbrace{\phantom{P(X_{(k)} \leq x ) = P(X_{(k)} < x)}}^{\substack{\text{different events} \\ \text{but same probability} \\ \text{for continuous distributions}}}} P(X_{(k)} \leq x ) &=& \underbrace{P(X_{(k)} < x) = P(N(x) \ge k)}_{\text{same events}}\\ && \overbrace{P(X_{(k)} \le x) = P(N^\prime(x) \ge k)}_{} &=& P(N(x) \ge k \llap{\underbrace{\phantom{P(N^\prime(x) \ge k) = P(N(x) \ge k}}_{\substack{\text{different events} \\ \text{but same probability} \\ \text{for continuous distributions}}}}) \end{array}$$

So we have $P(X_{(k)} \le x) = P(N(x) \ge k)$ not because the events are exactly the same, but because they are events with the same probability: the difference between the events has probability zero for continuous distributions.
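As a numerical sanity check, the binomial tail $P(N(x) \ge k) = \sum_{j=k}^{n}\binom{n}{j}F(x)^j(1-F(x))^{n-j}$ can be compared with a simulated value of $P(X_{(k)} \le x)$. The sketch below assumes a Uniform(0, 1) sample, so that $F(x) = x$; the function names, the seed, and the values $n = 5$, $k = 2$, $x = 0.4$ are arbitrary choices for illustration.

```python
import random
from math import comb

def binomial_tail(n, k, p):
    """P(N >= k) for N ~ Binomial(n, p), summed from j = k to n."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

def simulate_order_stat_cdf(n, k, x, trials=100_000, seed=0):
    """Monte Carlo estimate of P(X_(k) <= x) for n iid Uniform(0, 1) draws."""
    rng = random.Random(seed)            # fixed seed, for reproducibility only
    hits = 0
    for _ in range(trials):
        sample = sorted(rng.random() for _ in range(n))
        if sample[k - 1] <= x:           # k-th order statistic at most x
            hits += 1
    return hits / trials

n, k, x = 5, 2, 0.4
exact = binomial_tail(n, k, p=x)         # F(x) = x for Uniform(0, 1)
estimate = simulate_order_stat_cdf(n, k, x)
print(exact, estimate)
```

The two printed numbers should agree up to Monte Carlo error. Note that the sum starts at $j = k$; a sum starting at $j = 0$ would always equal 1.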