Probability Theory – Probability Distribution of $X_N$, $N = \min\{n \geq 2: X_n = $ Second Largest of $ X_1, \ldots, X_n\}$

probability theory, random variables

Given a sequence $X_1, X_2, \ldots$ of independent, continuous random variables with the same distribution function $F$ and density function $f$, let $N = \min\{n \geq 2 : X_n = \text{second largest of } X_1, \ldots, X_n\}$.

Denote $X_N$ as the first random variable which, at the time of observation, is the second largest of those observed so far. Find the probability density function $f_{X_N}(x)$.

I'm extremely lost on this problem; all I've come up with thus far is to look at random variables $X_M$, the maximum of all observed so far, and $X_{M-1}$, the 'second-to-maximum' but I have been unable to find their respective distribution/density functions.

Thanks in advance for any and all help.

Best Answer

Let $f(x)$ be the density function of the $X_i$, and $F(x)$ their cdf.

Fix $n$, and let $Z$ be the maximum of $X_1,\ldots,X_n$. For completeness we find the distribution of $Z$, though that is likely familiar to you.

The event $Z\le z$ happens precisely if all the $X_i$ are $\le z$. This has probability $(F(z))^n$. Thus $$F_Z(z)=(F(z))^n.$$ For the density function of $Z$, differentiate. We get $f_Z(z)=nf(z)(F(z))^{n-1}$.
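As a quick sanity check of $F_Z(z)=(F(z))^n$, here is a minimal simulation sketch. It assumes the $X_i$ are Uniform$(0,1)$, so that $F(z)=z$ on $[0,1]$; the function names, the number of trials, and the seed are illustrative choices, not part of the problem.

```python
import random


def empirical_max_cdf(n, z, trials=100_000, seed=0):
    """Estimate P(max(X_1, ..., X_n) <= z) for X_i ~ Uniform(0, 1) (an assumption)."""
    rng = random.Random(seed)
    hits = sum(
        max(rng.random() for _ in range(n)) <= z
        for _ in range(trials)
    )
    return hits / trials


def exact_max_cdf(n, z):
    """F_Z(z) = F(z)^n, where F(z) = z on [0, 1] for Uniform(0, 1)."""
    return z ** n


if __name__ == "__main__":
    n, z = 5, 0.8
    # The two printed numbers should be close to each other.
    print(empirical_max_cdf(n, z), exact_max_cdf(n, z))
```

Any other continuous distribution would do equally well: replace `rng.random()` with the appropriate sampler and `z ** n` with `F(z) ** n`.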


Let $Y$ be the second largest of the $X_i$. We give a highly informal derivation of the density function $f_Y(y)$ of $Y$.

Let $dy$ be "small." We find the probability that $Y$ lies between $y$ and $y+dy$. This will be approximately $f_Y(y)\,dy$.

Neglecting terms in higher powers of $dy$, the probability that the second largest lies between $y$ and $y+dy$ is the probability that some $X_i$ lies in this interval times the probability that $n-2$ of the $X_i$ lie below $y$ and $1$ lies above $y+dy$.

The $X_i$ that lies between $y$ and $y+dy$ can be chosen in $n$ ways. The probability it lies in the interval is approximately $f(y)\,dy$.

The probability that $n-2$ of the remaining $X_i$ lie below $y$, and $1$ lies above, is $\binom{n-1}{1}(1-F(y))^1 (F(y))^{n-2}$. "Thus," $$f_Y(y)=n\binom{n-1}{1}f(y)(1-F(y))(F(y))^{n-2}.$$ For the cdf $F_Y(y)$, integrate from $-\infty$ to $y$. The integral is in principle easy: make the substitution $u=F(y)$, which gives $F_Y(y)=n(F(y))^{n-1}-(n-1)(F(y))^{n}$.
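The informal counting argument can be checked numerically. Integrating $n(n-1)f(y)(1-F(y))(F(y))^{n-2}$ with $u=F(y)$ gives the cdf $n(F(y))^{n-1}-(n-1)(F(y))^{n}$, and the sketch below compares that closed form with a simulation. It assumes, purely for illustration, that the $X_i$ are Exponential$(1)$, so $F(y)=1-e^{-y}$; the function names and parameters are hypothetical.

```python
import math
import random


def second_largest_cdf(n, F_y):
    """Closed form n*F^(n-1) - (n-1)*F^n, evaluated at F_y = F(y)."""
    return n * F_y ** (n - 1) - (n - 1) * F_y ** n


def empirical_second_largest_cdf(n, y, trials=100_000, seed=0):
    """Estimate P(second largest of X_1..X_n <= y) for X_i ~ Exponential(1) (an assumption)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = sorted(rng.expovariate(1.0) for _ in range(n))
        if sample[-2] <= y:  # sample[-2] is the second largest value
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, y = 5, 1.5
    F_y = 1.0 - math.exp(-y)  # cdf of Exponential(1) at y
    # The two printed numbers should be close to each other.
    print(empirical_second_largest_cdf(n, y), second_largest_cdf(n, F_y))
```

For $n=2$ the second largest is the minimum, and the closed form reduces to $2F(y)-(F(y))^2=1-(1-F(y))^2$, as it should.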
