[Math] Probability of the maximum (Lévy stable) random variable in a list being greater than the sum of the rest

Tags: analysis, probability, reference-request

Original post on MathOverflow here.

Given a list of independent and identically distributed Lévy stable random variables, $(X_0, X_1, \dots, X_{n-1})$, what is the probability that the maximum exceeds the sum of the rest? That is, writing

$$ M = \max(X_0, X_1, \dots, X_{n-1}), $$

what is

$$ \text{Pr}\Big( M > \sum_{j=0}^{n-1} X_j - M \Big)? $$

Here, in Nolan's notation, $X_j \in S(\alpha, \beta=1, \gamma, \delta=0; 0)$, where $\alpha$ is the critical exponent, $\beta$ is the skew, $\gamma$ is the scale parameter and $\delta$ is the shift. For simplicity, I have taken the skew parameter $\beta$ to be 1 (maximally skewed to the right) and $\delta=0$, so everything has its mode centered in an interval near 0.

From numerical simulations, it appears that for the region $0 < \alpha < 1$, the probability converges to a constant, regardless of $n$ or $\gamma$. Below is a plot of this region for $n=500$, $0 < \alpha < 1$, where each point represents the result of 10,000 random draws. The graph looks exactly the same for $n=100, 200, 300$ and $400$.

[Figure: estimated probability vs. $\alpha$ for $n=500$, $0<\alpha<1$.]
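For anyone wanting to reproduce this, here is a minimal Monte Carlo sketch of the experiment above, assuming SciPy's `levy_stable` sampler. Note that SciPy's default parameterization may differ from Nolan's S0 convention by a location shift, and the event $M > S - M$ is not shift-invariant, so the conventions should be reconciled before comparing constants:

```python
import numpy as np
from scipy.stats import levy_stable

def prob_max_exceeds_rest(alpha, n=500, gamma=1.0, trials=10_000, seed=0):
    """Monte Carlo estimate of Pr(M > S - M) for n iid stable draws,
    with beta = 1 (maximal right skew) and delta = 0 as in the question.

    Caveat: SciPy's levy_stable may use a parameterization that differs
    from Nolan's S0 convention by a location shift; since M > S - M is
    not shift-invariant, check your SciPy version's convention.
    """
    rng = np.random.default_rng(seed)
    x = levy_stable.rvs(alpha, 1.0, loc=0.0, scale=gamma,
                        size=(trials, n), random_state=rng)
    m = x.max(axis=1)           # M, the maximum of each sample
    s = x.sum(axis=1)           # S, the sum of each sample
    return np.mean(m > s - m)   # fraction of trials where M exceeds the rest

# Sweep alpha over (0, 1); the estimates should be roughly flat in n and gamma.
for a in (0.25, 0.5, 0.75):
    print(f"alpha = {a}: P ~ {prob_max_exceeds_rest(a):.3f}")
```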

For $1 < \alpha < 2$ it appears to go as $O(1/n^{\alpha - 1})$ (maybe?), regardless of $\gamma$. Below is a plot of the probability for the two values $\alpha \in \{1.125, 1.3125\}$ as a function of $n$. Note that it is a log-log plot, and I have provided the graphs of $1/x^{0.125}$ and $1/x^{0.3125}$ for reference. It is hard to tell from the graph unless you line them up, but the fit for each is a bit off: the (log-log) slope of the actual data appears steeper than my guess in each case. Each point represents 10,000 iterations.

[Figure: log-log plot of the probability vs. $n$ for $\alpha \in \{1.125, 1.3125\}$, with reference curves $1/x^{0.125}$ and $1/x^{0.3125}$.]

For $\alpha=1$ it's not clear (to me) what's going on, but the probability appears to be decreasing in $n$, in a way that depends on both $n$ and $\gamma$.

I have tried making a heuristic argument in the form of:

$$\text{Pr}\Big( M > \sum_{j=0}^{n-1} X_j - M \Big) \le n\, \text{Pr}\Big( X_0 - \sum_{j=1}^{n-1} X_j > 0 \Big)$$

Then using the formulas provided by Nolan (p. 27) for the parameters of the implicit r.v. $U = X_0 - \sum_{j=1}^{n-1} X_j$, combined with the tail approximation:

$$ \text{Pr}( X > x ) \sim \gamma^{\alpha} c_{\alpha} ( 1 + \beta ) x^{-\alpha} $$
$$ c_{\alpha} = \sin( \pi \alpha / 2) \Gamma(\alpha) / \pi $$

but this leaves me nervous and a bit unsatisfied.
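For what it's worth, the right-hand side of the bound can also be estimated directly by simulation rather than through Nolan's parameter formulas; here is a sketch under the same assumptions (and SciPy parameterization caveat) as above:

```python
import numpy as np
from scipy.stats import levy_stable

def union_bound(alpha, n=500, gamma=1.0, trials=10_000, seed=1):
    """Monte Carlo estimate of the union bound n * Pr(U > 0), where
    U = X_0 - sum_{j=1}^{n-1} X_j and the X_j are iid stable with
    beta = 1 and delta = 0."""
    rng = np.random.default_rng(seed)
    x0 = levy_stable.rvs(alpha, 1.0, scale=gamma,
                         size=trials, random_state=rng)
    rest = levy_stable.rvs(alpha, 1.0, scale=gamma,
                           size=(trials, n - 1), random_state=rng).sum(axis=1)
    return n * np.mean(x0 - rest > 0)

print(union_bound(1.5, n=100))  # upper bound on Pr(M > S - M)
```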

Just for comparison, if the $X_j$ were taken to be uniform r.v.'s on the unit interval, this probability would decrease exponentially quickly. I imagine similar results hold if the $X_j$ were Gaussian, though any clarification on that point would be appreciated.
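(A short calculation suggests the uniform case is exactly $1/(n-1)!$: the events $\{X_i > \sum_{j\ne i} X_j\}$ are almost surely disjoint, and each has probability $1/n!$ since it forces the sum of the other $n-1$ uniforms below 1. A quick sketch checking this by simulation:)

```python
import numpy as np
from math import factorial

# Uniform comparison: for iid U(0,1) draws the events {X_i > sum of the rest}
# are disjoint and each has probability 1/n!, so Pr(M > S - M) = 1/(n-1)!.
rng = np.random.default_rng(4)
for n in (3, 5, 7):
    x = rng.random((1_000_000, n))
    m, s = x.max(axis=1), x.sum(axis=1)
    print(n, np.mean(m > s - m), 1 / factorial(n - 1))
```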

Getting closed-form solutions for this is probably out of the question, as there is not even a closed-form expression for the pdf of Lévy stable random variables, but bounds on the probability would be helpful. I would appreciate any help with how to analyze these types of questions in general, such as general methods or references to other work in this area.

If this problem is elementary, I would greatly appreciate any reference to a textbook, tutorial or paper that would help me solve problems of this sort.

UPDATE: George Lowther and Shai Covo have answered this question below. I just wanted to give a few more pictures that compare their answers to some of the numerical experiments that I did.

Below is the probability of the maximum element exceeding the sum of the rest, for a list of size $n=100$, as a function of $\alpha \in (0,1)$. Each point represents 10,000 simulations.

[Figure: probability vs. $\alpha \in (0,1)$ for $n=100$.]

Below are two graphs for the two values $\alpha \in \{1.53125, 1.875\}$. Both have the function $(2/\pi) \sin(\pi \alpha / 2)\, \Gamma(\alpha)\, n \left( \tan(\pi \alpha/2) (n^{1/\alpha} - n)\right)^{-\alpha}$ superimposed for reference, with different prefactors ($1/4$ and $1/37$, respectively) applied to line them up.

[Figures: probability vs. $n$ for $\alpha=1.53125$ and $\alpha=1.875$, with scaled reference curves.]

As George Lowther correctly pointed out, for the relatively small $n$ being considered here, the effect of the extra $n^{1/\alpha}$ term (when $1 < \alpha < 2$) is non-negligible, which is why my original reference plots did not line up with the results of the simulations. Once the full approximation is put in, the fit is much better.

When I get around to it, I will try to post some more pictures for the case $\alpha=1$ as a function of $n$ and $\gamma$.

Best Answer

What you are asking for is a special case of the joint distribution of the maximum and sum of a sequence of random variables. The fact that the variables are stable simplifies this and allows us to compute the joint distribution in the limit as $n$ goes to infinity. This limit can be described in terms of the joint distribution of the value of a stable Lévy process at time $t=1$ and its maximum jump size over $t\le1$. Lévy processes are right-continuous with left limits (càdlàg) and have stationary independent increments; stable Lévy processes are those whose increments have stable distributions.

If $X_i$ each have the $\alpha$-stable distribution denoted by $S(\alpha,\beta,\gamma,\delta)=S(\alpha,\beta,\gamma,\delta;0)$ in the text you link, then the sum $S=\sum_{i=0}^{n-1}X_i$ has the $S(\alpha,\beta,n^{1/\alpha}\gamma,\delta_n)$ distribution where $$ \delta_n=\begin{cases} n\delta + \gamma\beta\tan\frac{\pi\alpha}{2}(n^{1/\alpha}-n),&\textrm{if }\alpha\not=1,\\ n\delta +\gamma\beta\frac{2}{\pi}n\log n,&\textrm{if }\alpha=1. \end{cases}\qquad\qquad{\rm(1)} $$ This is equation (1.9) on page 20 of the linked notes. It will be helpful to rescale the sum to remove the dependence on n. The random variable $Y=n^{-1/\alpha}(S-\delta_n)$ has the $S(\alpha,\beta,\gamma,0)$ distribution (by Proposition 1.16).
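As a sanity check on this step, here is a sketch computing $\delta_n$ from (1) and comparing quantiles of the rescaled sum $Y$ with direct draws from $S(\alpha,\beta,\gamma,0)$. It assumes the sampler uses the same S0 parameterization as the notes; SciPy's default may differ by a location shift, so the comparison is only meaningful once the conventions are matched:

```python
import numpy as np
from scipy.stats import levy_stable

# If your SciPy version supports it, Nolan's S0 parameterization can be
# selected explicitly (an assumption; check your SciPy docs):
# levy_stable.parameterization = "S0"

def delta_n(alpha, beta, gamma, delta, n):
    """Shift parameter of the sum of n iid S(alpha, beta, gamma, delta; 0)
    variates, equation (1) / Nolan (1.9)."""
    if alpha != 1:
        return n * delta + gamma * beta * np.tan(np.pi * alpha / 2) * (n**(1 / alpha) - n)
    return n * delta + gamma * beta * (2 / np.pi) * n * np.log(n)

alpha, beta, gamma, delta, n = 0.75, 1.0, 1.0, 0.0, 200
rng = np.random.default_rng(2)
x = levy_stable.rvs(alpha, beta, scale=gamma, size=(10_000, n), random_state=rng)
y = n**(-1 / alpha) * (x.sum(axis=1) - delta_n(alpha, beta, gamma, delta, n))
# Quantiles of Y should match direct S(alpha, beta, gamma, 0) draws.
ref = levy_stable.rvs(alpha, beta, scale=gamma, size=10_000, random_state=rng)
print(np.quantile(y, [0.25, 0.5, 0.75]))
print(np.quantile(ref, [0.25, 0.5, 0.75]))
```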

Now, to relate this to Lévy processes: let $Z_t$ be the Lévy process with $Z_1\sim Y$. We can take $X_i=n^{1/\alpha}(Z_{(i+1)/n}-Z_{i/n})+\delta_n/n$, which will be independent with the required $S(\alpha,\beta,\gamma,\delta)$ distribution. In fact, in this case we have $Z_1=Y$ and $S=n^{1/\alpha}Z_1+\delta_n$. Writing $M=\max_iX_i$ and $\Delta Z^*_t$ for the maximum of $\Delta Z_s=Z(s)-Z(s-)$ over $s\le t$ gives $n^{-1/\alpha}(M-\delta_n/n)\to\Delta Z^*_1$ as $n$ goes to infinity. We can then compute the probability P that you are asking for to leading order as $n$ becomes large, $$ \begin{align} P&\equiv\mathbb{P}\left(M > S-M\right)\\ &=\mathbb{P}\left(2n^{-1/\alpha}(M-\delta_n/n) > n^{-1/\alpha}(S-2\delta_n/n)\right)\\ &\sim\mathbb{P}\left(2\Delta Z^*_1 > Z_1+n^{-1/\alpha}\delta_n(1-2/n)\right) \end{align} $$ So, $$ P\sim\mathbb{P}\left(2\Delta Z^*_1 - Z_1 > n^{-1/\alpha}\delta_n\right) $$ and the leading term in the asymptotics for P is determined by the distribution of $2\Delta Z^*_1-Z_1$.

For $\alpha < 1$ it can be seen from (1) that $n^{-1/\alpha}\delta_n\to\gamma\beta\tan\frac{\pi\alpha}{2}$ as n goes to infinity, so we get a finite and positive limit $$ P\to\mathbb{P}\left(2\Delta Z^*_1-Z_1 > \gamma\beta\tan\frac{\pi\alpha}{2}\right) $$ as $n\to\infty$. The tangent has only appeared in the expression here because of the particular parameterization that we are using. It is a bit simpler if we rewrite this in terms of $\tilde Z_t=Z_t+\gamma\beta t\tan\frac{\pi\alpha}{2}$. Then, for the $\beta=1$ case you are using, $\tilde Z_t$ has support $[0,\infty)$ (Lemma 1.10 in the linked text). This means that $\tilde Z$ is a stable subordinator and the required probability is $$ P\to\mathbb{P}\left(2\Delta\tilde Z^*_1 > \tilde Z_1\right)=\frac{\sin(\pi\alpha)}{\pi\alpha}. $$ [I've been a bit sneaky here, and substituted in the expression that Shai Covo gave in his answer for $\mathbb{P}(2\Delta\tilde Z^*_1 > \tilde Z_1)$ and simplified a bit using Euler's reflection formula.]
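This limit is easy to compare against simulation; here is a sketch under the question's $\beta=1$, $\delta=0$ assumptions (with the usual SciPy parameterization caveat, and finite-$n$ bias still visible at $n=500$):

```python
import numpy as np
from scipy.stats import levy_stable

alpha, n, trials = 0.5, 500, 10_000
rng = np.random.default_rng(5)
x = levy_stable.rvs(alpha, 1.0, size=(trials, n), random_state=rng)
m, s = x.max(axis=1), x.sum(axis=1)
print("simulated:", np.mean(m > s - m))
# Limiting value sin(pi*alpha)/(pi*alpha); equals 2/pi ~ 0.637 at alpha = 0.5.
print("predicted:", np.sin(np.pi * alpha) / (np.pi * alpha))
```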

On the other hand, if $\alpha > 1$ then (1) gives $n^{-1/\alpha}\delta_n\sim n^{1-1/\alpha}(\delta-\gamma\beta\tan\frac{\pi\alpha}{2})$ and, if $\alpha=1$, then $n^{-1/\alpha}\delta_n\sim\gamma\beta\frac{2}{\pi}\log n$. In this case, P tends to zero at a rate dependent on the tail of the distribution function of $2\Delta Z^*_1-Z_1$.

It is also possible to use the Lévy-Khintchine representation to compute the tail of the distribution function of $2\Delta Z^*_1-Z_1$. The jumps of a Lévy process are described by a Lévy measure $\nu$ on the real numbers, so that jumps with size in a set $A$ occur according to a Poisson process of rate $\nu(A)$. An $\alpha$-stable process has Lévy measure $d\nu(x)=c\vert x\vert^{-\alpha-1}\,dx$, where $c$ only depends on the sign of $x$. We can back out the value of $c$ by calculating the characteristic function of $Z_1$ using the Lévy-Khintchine formula (essentially the Fourier transform of $\nu$, see Wikipedia, which uses W for the Lévy measure) and comparing with expression (1.4) from your text. Going through this computation gives $$ d\nu(x)=\pi^{-1}\alpha\gamma^\alpha\sin\frac{\pi\alpha}{2}\Gamma(\alpha)(1+\beta{\rm sign}(x))\vert x\vert^{-\alpha-1}\,dx. $$ The number of jumps of size greater than $x$ is Poisson distributed with rate $\nu((x,\infty))$, giving $$ \begin{align}\mathbb{P}(\Delta Z^*_1> x) &= 1-e^{-\nu((x,\infty))}\\ &\sim\nu((x,\infty))\\ &=\pi^{-1}\gamma^\alpha\sin\frac{\pi\alpha}{2}\Gamma(\alpha)(1+\beta)x^{-\alpha}. \end{align}\qquad{\rm(2)} $$

Comparing with your linked text (Theorem 1.12), this is exactly the same as the tail of $Z_1$! That is, $\mathbb{P}(\Delta Z^*_1 > x)\sim\mathbb{P}(Z_1 > x)$. This means that the tail of the distribution is entirely due to the occasional big jump of the Lévy process. This does actually correspond to experience. I have plotted Cauchy process sample paths before, and you do get some paths with a single big jump drowning out all other behaviour. We can also see why this should be the case. The density of the jump distribution is sub-exponential, proportional to $\vert x\vert^{-\alpha-1}$, so one big jump of size $x$ has a much bigger probability than $n$ small jumps of size $x/n$. It also suggests that, in your simulation of the $X_i$, the samples contributing to the probability come mainly from the case where all of them are reasonably close to the mean except for one extreme (positive or negative) $X_i$.
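As a quick numerical illustration of (2) and the tail comparison, one can evaluate $1-e^{-\nu((x,\infty))}$ against SciPy's survival function for $Z_1$. This is a sketch with $\gamma=1$; the agreement is only asymptotic, the parameterization shift is irrelevant far in the tail, and numerical accuracy there depends on SciPy's integrator:

```python
import numpy as np
from scipy.special import gamma as Gamma
from scipy.stats import levy_stable

def nu_tail(x, alpha, beta=1.0, scale=1.0):
    """nu((x, inf)) = pi^-1 scale^alpha sin(pi a/2) Gamma(a) (1+beta) x^-alpha."""
    return (scale**alpha * np.sin(np.pi * alpha / 2) * Gamma(alpha)
            * (1 + beta) / np.pi) * x**(-alpha)

alpha = 1.5
for x in (10.0, 100.0, 1000.0):
    p_jump = 1 - np.exp(-nu_tail(x, alpha))   # Pr(max jump > x), from (2)
    p_z1 = levy_stable.sf(x, alpha, 1.0)      # Pr(Z_1 > x), numerically
    print(f"x = {x:6.0f}:  jump tail {p_jump:.3e},  Z_1 tail {p_z1:.3e}")
```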

So, from the argument above, the leading asymptotic of the probability P comes from the case where $\Delta Z_1^*$ is very large compared to $Z_1-\Delta Z_1^*$, or when the largest negative jump $\Delta(-Z)^*_1$ is large compared to $Z_1+\Delta(-Z)^*_1$. This gives $$ \begin{align} \mathbb{P}(2\Delta Z^*_1-Z_1 > x) &\sim\mathbb{P}(\Delta Z^*_1 > x) +\mathbb{P}(\Delta (-Z)^*_1 > x)\\ &\sim 2\pi^{-1}\gamma^{\alpha}\sin\frac{\pi\alpha}{2}\Gamma(\alpha)x^{-\alpha}, \end{align} $$ where I substituted in (2). Plugging in the asymptotic form for $x=n^{-1/\alpha}\delta_n$ calculated above gives $$ P\sim2\pi^{-1}\gamma^{\alpha}\sin\frac{\pi\alpha}{2}\Gamma(\alpha)\left(\delta-\gamma\beta\tan\frac{\pi\alpha}{2}\right)^{-\alpha}n^{1-\alpha} $$ for $\alpha > 1$. This confirms your $O(1/n^{\alpha-1})$ guess from the numerical simulation! Also, using $x=n^{-1/\alpha}\delta_n\sim\gamma\beta\frac{2}{\pi}\log n$ for $\alpha=1$ gives $$ P\sim\frac{1}{\beta\log n}. $$ Hopefully this is all correct, but I can't rule out making an arithmetic slip above.

Edit: The limit above for $\alpha > 1$ does not look very good. It appears to be slightly above the plot in the original post. However, even for n=1000, it should not be very close to the asymptotic limit described. Replacing the asymptotic limit for $\delta_n$ by its exact form effectively replaces the $n^{1-\alpha}$ term for P by $(n^{1-1/\alpha}-1)^{-\alpha}$. For n=1000 this approximately doubles the calculated number, so the more accurate value for $\delta_n$ moves you further away from the plot. Something has gone wrong somewhere...
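To make the edit's point concrete, here is a sketch evaluating both the leading-order prediction and the variant using the exact $\delta_n$, with $\delta=0$, $\beta=1$, $\gamma=1$ as in the question (function names are illustrative, not from any library):

```python
import numpy as np
from scipy.special import gamma as Gamma

def p_leading(alpha, n, gam=1.0, beta=1.0, delta=0.0):
    """Leading-order prediction for alpha > 1:
    P ~ (2/pi) gam^alpha sin(pi a/2) Gamma(a)
        * (delta - gam*beta*tan(pi a/2))^(-alpha) * n^(1-alpha)."""
    c = (2 / np.pi) * gam**alpha * np.sin(np.pi * alpha / 2) * Gamma(alpha)
    return c * (delta - gam * beta * np.tan(np.pi * alpha / 2))**(-alpha) * n**(1 - alpha)

def p_exact_shift(alpha, n, gam=1.0, beta=1.0, delta=0.0):
    """Same constant, but with x = n^(-1/alpha) delta_n in its exact form
    rather than its large-n asymptote."""
    d_n = n * delta + gam * beta * np.tan(np.pi * alpha / 2) * (n**(1 / alpha) - n)
    c = (2 / np.pi) * gam**alpha * np.sin(np.pi * alpha / 2) * Gamma(alpha)
    return c * (n**(-1 / alpha) * d_n)**(-alpha)

for n in (100, 1000):
    print(n, p_leading(1.5, n), p_exact_shift(1.5, n))
```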
