The probability that ${x_1}^{1/k_1}+{x_2}^{1/k_2}+\dotsb+{x_n}^{1/k_n}$ is less than $1$ – combinatorial proof

multinomial-coefficients, probability

A friend and I were playing around with Beta integrals and we noticed the following fact.

Choose some positive integers $k_1,k_2,\dots,k_n$, and let $x_1,x_2,\dots,x_n$ be independent uniform random variables between $0$ and $1$. Then the probability that
${x_1}^{1/k_1}+{x_2}^{1/k_2}+\dotsb+{x_n}^{1/k_n}\le1$ is $$\frac{k_1!k_2!\dotsm k_n!}{(k_1+k_2+\dotsb+k_n)!}.$$
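
(Before hunting for a proof, it's easy to sanity-check the claim numerically. Below is a minimal Monte Carlo sketch in Python; the names `estimate` and `exact` are just labels I chose for this illustration.)

```python
import math
import random

def estimate(ks, trials=200_000):
    """Monte Carlo estimate of P(x_1^(1/k_1) + ... + x_n^(1/k_n) <= 1)."""
    hits = sum(
        sum(random.random() ** (1 / k) for k in ks) <= 1
        for _ in range(trials)
    )
    return hits / trials

def exact(ks):
    """The conjectured value k_1! k_2! ... k_n! / (k_1 + ... + k_n)!."""
    return math.prod(math.factorial(k) for k in ks) / math.factorial(sum(ks))

ks = [1, 2, 3]
print(estimate(ks))  # ≈ 0.0167
print(exact(ks))     # 1!·2!·3!/6! = 12/720 ≈ 0.016667
```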

(Side note: in fact, using the Gamma extension of the factorial, this works for noninteger $k_i$ as well. This has the nice corollary that the volume of a ball of radius $r$ is $2^3\frac{\left(\frac12!\right)^3}{\frac32!}r^3=\frac43\pi r^3$: take $k_1=k_2=k_3=\frac12$, so the probability is the volume of one octant of the unit ball, and multiply by $2^3$ for the eight octants. The same computation yields the volume formula for the $n$-ball in every dimension.)
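
(For the skeptical reader, here is a quick numerical check of the ball-volume corollary, using `math.gamma` for the factorial extension; the helper `ball_volume` is a name invented for this sketch.)

```python
import math

def ball_volume(d, r=1.0):
    """Volume of the d-ball via the k_i = 1/2 case: 2^d * ((1/2)!)^d / (d/2)! * r^d."""
    half_factorial = math.gamma(1.5)  # (1/2)! = Gamma(3/2) = sqrt(pi)/2
    return 2**d * half_factorial**d / math.gamma(d / 2 + 1) * r**d

print(ball_volume(3))    # 4.18879...
print(4 / 3 * math.pi)   # agrees: 4π/3
```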

This result is particularly clean in that it can be expressed as a multinomial coefficient: it's $\dfrac1{\binom{k_1+k_2+\dotsb+k_n}{k_1,k_2,\dots,k_n}}$. (The multinomial coefficient describes the number of ways of distributing $k_1+k_2+\dotsb+k_n$ objects into bins of sizes $k_1,k_2,\dots,k_n$.) In particular, it is always the reciprocal of an integer. Given the combinatorial nature of the solution, we should expect a combinatorial proof. Is there one? (One can prove this using Beta integrals, as I noted above, but I wouldn't call this combinatorial.)

In other words: is there an elementary proof of the identity $P({x_1}^{1/k_1}+{x_2}^{1/k_2}+\dotsb+{x_n}^{1/k_n}\le1)=\frac{k_1!k_2!\dotsm k_n!}{(k_1+k_2+\dotsb+k_n)!}$?

(P.S. The case where $k_1=\dotsb=k_n=1$ is well-known.)

Best Answer

I realized the answer soon after writing the question. Since the site explicitly encourages people to answer their own questions, I figured I would write it up.

First, notice that if $x_1,\dots,x_k$ are independent uniform random variables on $[0,1]$, then $\max\{x_1,\dots,x_k\}$ has the same distribution as $x^{1/k}$, where $x$ is uniform on $[0,1]$. To prove this, note that for any $c\in[0,1]$, \begin{array}{c} P(\max\{x_1,\dots,x_k\}\le c)=\prod_{i=1}^k P(x_i\le c)=c^k\\ P(x^{1/k}\le c)=P(x\le c^k)=c^k \end{array} Thus the CDFs agree, and therefore so do the distributions.
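
(If one wants an empirical illustration of this fact: both empirical CDFs, evaluated at a test point $c$, should be close to $c^k$. A quick check, not part of the proof:)

```python
import random

# Compare the empirical CDFs of max(U_1, ..., U_k) and U^(1/k) at a point c.
k, c, trials = 4, 0.8, 200_000
max_hits = sum(max(random.random() for _ in range(k)) <= c for _ in range(trials))
pow_hits = sum(random.random() ** (1 / k) <= c for _ in range(trials))
print(max_hits / trials, pow_hits / trials, c ** k)  # all ≈ 0.4096
```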

Therefore, our question is equivalent to proving that \begin{align*} P\big(&\max\{x_{1,1},\dots,x_{1,k_1}\}+\max\{x_{2,1},\dots,x_{2,k_2}\}+{}\dotsb\\ &{}+\max\{x_{n,1},\dots,x_{n,k_n}\}\le1\big)=\frac{k_1!k_2!\dotsm k_n!}{(k_1+k_2+\dotsb+k_n)!} \end{align*} for $k_1+k_2+\dotsb+k_n$ independent uniform variables $x_{1,1},\dots,x_{n,k_n}$.
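
(This reformulation can also be checked directly by simulation, replacing each power variable with the maximum of a group of uniforms:)

```python
import random

# Estimate P(max-group sums <= 1) for groups of sizes ks.
ks = [1, 2, 3]
trials = 200_000
hits = sum(
    sum(max(random.random() for _ in range(k)) for k in ks) <= 1
    for _ in range(trials)
)
print(hits / trials)  # ≈ 1/60 ≈ 0.0167, matching the power-variable version
```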


On the other hand, we can work backwards from the solution. Suppose again that we choose $k_1+k_2+\dotsb+k_n$ independent uniform variables $x_{1,1},\dots,x_{n,k_n}$, as above. Then the probability that all of the $x_{1,i}$ are less than all of the $x_{2,i}$, which are less than all of the $x_{3,i}$, and so on, is also $\frac{k_1!k_2!\dotsm k_n!}{(k_1+k_2+\dotsb+k_n)!}$. This is because the variables are almost surely distinct and every relative ordering of them is equally likely: of the $(k_1+k_2+\dotsb+k_n)!$ orderings in total, exactly $k_1!k_2!\dotsm k_n!$ satisfy the restriction, since we may permute freely within each group.
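
(For small group sizes this counting can be verified by brute force; the sketch below enumerates orderings for group sizes $(1,2,2)$.)

```python
import math
from itertools import permutations

# Label each variable by its group and count orderings where group 0's values
# all precede group 1's, which all precede group 2's (i.e. the sorted order).
ks = [1, 2, 2]
labels = [g for g, k in enumerate(ks) for _ in range(k)]
good = sum(p == tuple(sorted(p)) for p in permutations(labels))
total = math.factorial(sum(ks))
# permutations() treats equal labels as distinct, so the sorted ordering is hit
# exactly k_1!·k_2!·...·k_n! times out of (k_1+...+k_n)! tuples.
print(good / total)                                       # 1/30
print(math.prod(math.factorial(k) for k in ks) / total)   # 1/30
```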


Let's use vector notation as a shorthand. Let $\max\vec v$ denote the maximum coordinate of $\vec v$, and let $\vec u\le\vec v$ mean that every coordinate of $\vec u$ is at most every coordinate of $\vec v$. Then $$A=\{\vec{\mathbf x}\in[0,1]^{k_1}\times\dotsm\times[0,1]^{k_n}:\max\vec x_1+\dotsb+\max\vec x_n\le1\}$$ is the set on which the procedure in the first section succeeds, and $$B=\{\vec{\mathbf x}\in[0,1]^{k_1}\times\dotsm\times[0,1]^{k_n}:\vec x_1\le\dotsb\le\vec x_n\}$$ is the set on which the second procedure succeeds. (It's worth noting here that $\binom{k_1+\dotsb+k_n}{k_1,\dots,k_n}$ congruent copies of $B$, obtained by permuting the coordinates, fill up $[0,1]^{k_1+\dotsb+k_n}$.) We wish to show that $\operatorname{Vol}(A)=\operatorname{Vol}(B)$. I claim that $B$ can be transformed into $A$ by a bijective piecewise-linear transformation with Jacobian determinant $1$ almost everywhere, which finishes the proof.

Let $T:B\to A$ be the piecewise-linear transformation that subtracts from every variable of each group the maximum of the previous group. In vector shorthand, $$T:(\vec x_1,\dots,\vec x_n)\mapsto(\vec x_1,\vec x_2-\max\vec x_1,\dots,\vec x_n-\max\vec x_{n-1}).$$ Expanding the shorthand, this means $$T:(\dots,x_{i,j},\dots)\mapsto(\dots,x_{i,j}-\max\{x_{i-1,1},\dots,x_{i-1,k_{i-1}}\},\dots).$$ By the definition of $B$, each coordinate maps into $[0,1]$. The map is bijective, since we can define $T^{-1}:A\to B$ by adding back the maxima of all previous groups: \begin{align*} T^{-1}:(\vec x_1,\dots,\vec x_n)\mapsto(&\vec x_1,\vec x_2+\max\vec x_1,\dots,\\&\ \vec x_n+\max\vec x_{n-1}+\dotsb+\max\vec x_1). \end{align*} $T$ is piecewise linear since $\max$ is a piecewise-linear function of its arguments. Except on a set of measure zero, at every point $\vec{\mathbf x}=(\vec x_1,\dots,\vec x_n)$ the Jacobian of $T$ is a triangular matrix with $1$s on its diagonal (exactly which matrix depends on which coordinate attains each group's maximum at $\vec{\mathbf x}$); therefore, since $T$ is bijective, it preserves volume. Thus $\operatorname{Vol}(A)=\operatorname{Vol}(B)$, and the theorem is proved.
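
(Here is a small computational sketch of $T$ and $T^{-1}$, checking on a random point that $T$ carries $B$ into $A$ and that $T^{-1}$ undoes it. The helper names are mine; the inverse is written with a running maximum, which agrees with the telescoping sum of maxima in the formula above.)

```python
import random

ks = [2, 3, 2]  # example group sizes

def sample_B(ks):
    """A point of B: sorted uniforms split into consecutive groups.
    (Each group is kept sorted internally for simplicity; still a point of B.)"""
    vals = sorted(random.random() for _ in range(sum(ks)))
    groups, i = [], 0
    for k in ks:
        groups.append(vals[i:i + k])
        i += k
    return groups

def T(groups):
    """Subtract the previous group's maximum from every coordinate."""
    out, prev_max = [], 0.0
    for g in groups:
        out.append([x - prev_max for x in g])
        prev_max = max(g)
    return out

def T_inv(groups):
    """Add back the previous restored group's maximum; this telescopes to
    the sum of maxima of all previous groups in the formula above."""
    out, shift = [], 0.0
    for g in groups:
        restored = [x + shift for x in g]
        out.append(restored)
        shift = max(restored)
    return out

b = sample_B(ks)
a = T(b)
print(sum(max(g) for g in a) <= 1)  # True: T(b) lands in A
print(all(abs(x - y) < 1e-12 for g_a, g_b in zip(T_inv(a), b)
          for x, y in zip(g_a, g_b)))  # True: T_inv recovers b
```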



PS: A few hours after writing this, I found a second proof. Let $\Delta_k=\{(x_1,\dots,x_k)\in[0,1]^k:x_1+\dotsb+x_k\le1\}$. If we pick a point uniformly at random from $\Delta_k$, then $x_1+\dotsb+x_k$ is distributed like $x^{1/k}$: the region of $\Delta_k$ where the sum is at most $c$ is a copy of $\Delta_k$ scaled by a factor of $c$, so $P(x_1+\dotsb+x_k\le c)=c^k$. Furthermore, it is well known (and can be shown by similar logic) that $\operatorname{Vol}(\Delta_k)=1/k!$. From there, since the event that the grand total is at most $1$ cuts out exactly $\Delta_{k_1+\dotsb+k_n}$ inside the product $\Delta_{k_1}\times\dotsm\times\Delta_{k_n}$, the probability we want is simply $$\frac{\operatorname{Vol}(\Delta_{k_1+\dotsb+k_n})}{\operatorname{Vol}(\Delta_{k_1}\times\dotsm\times\Delta_{k_n})}=\frac{1/(k_1+\dotsb+k_n)!}{(1/k_1!)\dotsm(1/k_n!)}=\frac{k_1!\dotsm k_n!}{(k_1+\dotsb+k_n)!}.$$ In my eyes the first proof is still more "combinatorial," but I thought I'd post this one too.
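
(One last numerical sanity check, this time of the scaling fact: sampling $\Delta_k$ by rejection from the unit cube, the coordinate sum should satisfy $P(\text{sum}\le c)=c^k$. The helper `sample_simplex` is my own name.)

```python
import random

def sample_simplex(k):
    """Uniform point of Delta_k by rejection sampling from the unit cube."""
    while True:
        p = [random.random() for _ in range(k)]
        if sum(p) <= 1:
            return p

k, c, trials = 3, 0.7, 100_000
hits = sum(sum(sample_simplex(k)) <= c for _ in range(trials))
print(hits / trials, c ** k)  # both ≈ 0.343 = 0.7^3
```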
