Distribution of the maximum distance between uniformly distributed random variables

probabilityprobability distributionsstatisticsuniform distribution

I have a question regarding the distribution of the maximum distance between random variables. I would formalize the problem as follow :

Let $n \geq 2$ be an integer and $X_1, …, X_n$ be iid random variables following distribution $\mathcal{U}([a, b])$. For some $\delta \in [0, ~b-a]$, what is the value of \begin{equation}\mathbb{P}\left(\underset{1\leq i < j \leq n}{\max}|X_i-X_j|\leq \delta\right)\end{equation}

I have considered 3 ways of doing so :

  1. I know the density of $|X_i-X_j|$ is $x \mapsto \frac{2(b-a)-2x}{(b-a)^2}1_{[0,~b-a]}(x)$. I am however stuck as in my case a maximum appears and there is no independance between all the $|X_i-X_j|$.

  2. I have tried ordering the variables as $X_{(1)}, …, X_{(n)}$ such that $X_{(1)}\leq … \leq X_{(n)}$ to rewrite the quantity as $\mathbb{P}\left(\underset{1 \leq k < n}{\max}\left(X_{(k+1)}-X_{(k)}\right)\leq \delta\right)$.
    I know from this wikipedia page that assuming $a=0$ and $b=1$ sets the distribution of $X_{(k+1)}-X_{(k)}$ to be a Beta with parameters $(1,n)$. I am however still stuck because of that maximum and the absence of independence.

  3. I have seen on the thread Probability distribution of the difference of two uniform variables an interesting geometrical interpretation, though it looked tedious to apply with more variates and I haven't tried going in this direction.

I have done some empirical testing for variables over $[0,1]$, and the Gumbel distribution seems to be an excellent approximation, getting even better when $n$ gets larger (see fig:Histogram and infered Gumbel Distribution for an example with $n=10$). I guess it is fair since I'm trying to evaluate the distribution of a maximum.

Do you have any idea on how to derive the distribution of $\underset{1\leq i < j \leq n}{\max}|X_i-X_j|$ ?

Best Answer

Note that $\max_{1\le i, j \le n} |X_i-X_j| = X_{(n)}-X_{(1)}$.

Assume $[a,b]=[0,1]$; in general a recentering and rescaling reduces to this case. There are $n+1$ gaps between the numbers $0, X_{(1)},\ldots,X_{(n)}, 1$; the joint distribution of those gaps is the Dirichlet distribution with parameters $(1,1,\ldots,1)$. Hence your $X_{(n)}-X_{(1)}$, which is the sum of the middle $n-1$ of the gaps, is Beta distributed with parameters $(n-1,2)$. See https://en.wikipedia.org/wiki/Order_statistic#Order_statistics_sampled_from_a_uniform_distribution .