Are $\min(X_1,\ldots,X_n)$ and $\min(X_1Y_1,\ldots,X_nY_n)$ independent as $n \to \infty$?

extreme-value-analysis, probability, statistics

This is a question that I originally posted on stats.stackexchange.com. Since I received no satisfying answer there, even though the question was upvoted by many, I want to use the opportunity to extend it and hopefully reach a larger audience. The original question can be found here:

https://stats.stackexchange.com/questions/432396/are-minx-1-ldots-x-n-and-minx-1y-1-ldots-x-ny-n-independent-for-n-to

Assume we are given two continuous i.i.d. random variables $X$ and $Y$ with support $[1,c)$, where $c$ is some constant greater than one. (The exact value of $c$ is probably unimportant anyway.)
Now assume I have i.i.d. samples $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_n$ (so there is absolutely no dependence here).

Imagine that I know that:

$$(1): \mathbb P \left(\frac{\min(X_1,\ldots,X_n)-a_n}{b_n}\leq x_1\right) \sim F(x_1), \text{ for }n \to \infty,$$

where $F(x_1)$ is some non-degenerate cdf. Under weak conditions it is usually quite easy to derive the sequences $a_n$, $b_n$ and the limit distribution $F$, since this is closely connected to Extreme Value Theory. Moreover, I know that

$$(2):\mathbb P \left(\frac{\min(X_1Y_1,\ldots,X_nY_n)-\bar a_n}{\bar b_n}\leq x_2\right) \sim G(x_2), \text{ for }n \to \infty,$$

where $G(x_2)$ is again some non-degenerate cdf. Is it true that it then also follows that

$$(3):\mathbb P \left(\frac{\min(X_1,\ldots,X_n)-a_n}{b_n}\leq x_1,\frac{\min(X_1Y_1,\ldots,X_nY_n)-\bar a_n}{\bar b_n}\leq x_2\right) \sim F(x_1) G(x_2),$$

as $n \to \infty$?
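Before going into my thoughts, here is a minimal Monte Carlo sketch to get some numerical feeling for (3). It assumes, purely for illustration, that $X$ and $Y$ are uniform on $[1,2]$ and uses the normalizations $a_n=\bar a_n=1$, $b_n=1/n$, $\bar b_n=\sqrt{2/n}$ with limits $F(x_1)=1-e^{-x_1}$ and $G(x_2)=1-e^{-x_2^2}$, which are derived for this particular case in the accepted answer below:

```python
# Monte Carlo sketch for (3): X, Y ~ Uniform[1, 2] (illustrative choice only).
# Normalizations a_n = abar_n = 1, b_n = 1/n, bbar_n = sqrt(2/n), as derived
# for this distribution later in the thread.
import numpy as np

rng = np.random.default_rng(0)
n, trials = 2_000, 20_000
x1, x2 = 1.0, 1.0

m1 = np.empty(trials)   # rescaled min of the X_i
m2 = np.empty(trials)   # rescaled min of the X_i * Y_i
for i in range(trials):
    X = rng.uniform(1.0, 2.0, n)
    Y = rng.uniform(1.0, 2.0, n)
    m1[i] = (X.min() - 1.0) * n
    m2[i] = ((X * Y).min() - 1.0) / np.sqrt(2.0 / n)

joint = np.mean((m1 <= x1) & (m2 <= x2))             # empirical joint probability
prod = np.mean(m1 <= x1) * np.mean(m2 <= x2)          # product of empirical marginals
limit = (1 - np.exp(-x1)) * (1 - np.exp(-x2 ** 2))    # F(x1) * G(x2)
print(f"joint = {joint:.4f}, product = {prod:.4f}, F(x1)G(x2) = {limit:.4f}")
```

For large $n$ the three printed numbers should be close, which is at least consistent with (3).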

At first I thought (3) could not hold, since $X$ and $XY$ are obviously dependent; but then I thought the following:

  1. The probability that the minimum of $X_1,\ldots,X_n$ and the minimum of $X_1Y_1,\ldots,X_nY_n$ are attained in the same realization converges to zero as $n \to \infty$.
  2. Since the sample itself is i.i.d., the minima should then be more or less independent.

So I don't know whether this is true. Unfortunately, I cannot think of a counterexample, and I also have no idea how to prove it. The only thing I came up with to justify point 1 is the following:

The probability that both minima are attained in the same realization is given by

\begin{align*}
&\sum_{i=1}^n\mathbb P\big(X_i=\min(X_1,\ldots,X_n),\ X_iY_i=\min(X_1Y_1,\ldots,X_nY_n)\big) \\
&\quad\leq \sum_{i=1}^n\mathbb P\big(X_i=\min(X_1,\ldots,X_n),\ Y_i \leq \min(X_1Y_1,\ldots,X_nY_n)\big) \\
&\quad= n \cdot \mathbb P\big(X_1=\min(X_1,\ldots,X_n)\big)\, \mathbb P\big( Y_1 \leq \min(X_1Y_1,\ldots,X_nY_n) \mid X_1=\min(X_1,\ldots,X_n) \big) \\
&\quad= n \cdot \tfrac{1}{n}\, \mathbb P\big( Y_1 \leq \min(X_1Y_1,\ldots,X_nY_n) \mid X_1=\min(X_1,\ldots,X_n) \big),
\end{align*}

where the latter probability converges to zero, since $\min(X_1Y_1,\ldots,X_nY_n)$ gets arbitrarily close to $1$ as $n \to \infty$ (and the conditioning does not seem to change that). Therefore, the probability that both minima are realized in the same observation is something like $n \cdot 1/n \cdot o(1)=o(1)$, so it converges to zero…
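For what it's worth, point 1 can also be checked by simulation. A quick sketch (again taking $X$ and $Y$ uniform on $[1,2]$, which is only an illustrative choice) estimates how often both minima come from the same index $i$:

```python
# How often is argmin(X) equal to argmin(X*Y)?  X, Y ~ Uniform[1, 2], illustration only.
import numpy as np

rng = np.random.default_rng(1)
trials = 2_000
for n in (10, 100, 1_000, 10_000):
    hits = 0
    for _ in range(trials):
        X = rng.uniform(1.0, 2.0, n)
        Y = rng.uniform(1.0, 2.0, n)
        hits += int(np.argmin(X) == np.argmin(X * Y))
    print(n, hits / trials)   # this fraction should shrink as n grows
```

The printed fraction should shrink as $n$ grows, in line with the heuristic above.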

Now the argument above is obviously not a rigorous proof. So is there anyone smarter or more knowledgeable than me with an idea for a proof, or a counterexample showing why my idea is wrong?


Since antkam commented on this, I want to give a brief overview of Extreme Value Theory:

Most people probably know the central limit theorem:

$$\frac{S_n-n\mu}{\sqrt n \sigma} \xrightarrow[]{D}N(0,1), $$

where $S_n$ is the sum of $n$ i.i.d. random variables with mean $\mu$ and standard deviation $\sigma$.

Something similar can be obtained for maxima, which is the subject of Extreme Value Theory: it is known that

$$\frac{\vee X-a_n}{b_n}$$

can only converge to one of the three extreme value distributions (or to some constant), where $\vee X= \max(X_1,\ldots,X_n)$. For minima we can use this as well and find limit distributions and normalizing sequences such that

$$\frac{\land X-a_n}{b_n}$$ converges to some distribution, where $\land X = \min(X_1,\ldots,X_n)$. Now, what you, antkam, probably wanted to say is this:

If $\frac{\land XY-\bar a_n}{\bar b_n}\leq x_2$, then $\land XY \leq x_2 \bar b_n+\bar a_n$, and therefore in particular $\land X \leq x_2 \bar b_n+\bar a_n$ (since $Y \geq 1$). Now, $\frac{\land X- a_n}{ b_n}\leq x_1$ is equivalent to $\land X \leq x_1 b_n + a_n$.

So if $x_2 \bar b_n+\bar a_n \leq x_1 b_n+ a_n$, then $\frac{\land XY-\bar a_n}{\bar b_n}\leq x_2$ already implies $\frac{\land X- a_n}{ b_n}\leq x_1$.

That was, by the way, also how someone wanted to prove that this is incorrect. If we solve this inequality for $x_1$, we get:

$$ x_1 \geq\frac{ x_2 \bar b_n+\bar a_n- a_n}{b_n}$$

So if $x_1$ stays greater than the term on the right, then the independence claim (3) would certainly be incorrect. The problem is that we do not know these sequences, and therefore the right-hand side could (and probably will) go to infinity, in which case this argument obviously does not work. So if you want to disprove (3) like that, there are basically just two ways I can think of:

  1. You take some specific distribution for $X$ and $Y$, calculate the corresponding sequences and show that the right-hand side indeed does not go to infinity (see the sketch after this list for what happens in the uniform case);

  2. You take some other sequences $a_n$, $b_n$, $\bar a_n$, $\bar b_n$ (I never said that it only works for the sequences that give a nice limit distribution), but then $\frac{\land X-a_n}{b_n}$ would only converge in probability to a fixed number, or go to infinity or minus infinity, or something like that; so the limiting cdf of $\frac{\land X-a_n}{b_n}$, or of $\frac{\land XY-\bar a_n}{\bar b_n}$, would only take the values 0 and 1…
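To make the first point concrete, here is a quick check for one particular distribution (this is only an illustration; the distribution and the sequences are the ones from the uniform case treated in the accepted answer below): if $X$ and $Y$ are uniform on $[1,2]$, one can take $a_n=\bar a_n=1$, $b_n=1/n$ and $\bar b_n=\sqrt{2/n}$, and then the term on the right-hand side becomes

$$\frac{x_2 \bar b_n+\bar a_n- a_n}{b_n}=\frac{x_2\sqrt{2/n}}{1/n}=x_2\sqrt{2n}\xrightarrow[n\to\infty]{}\infty \qquad (x_2>0),$$

so at least in this case the right-hand side does go to infinity and the argument gives no contradiction to (3).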

Hope this helps some people to understand the problem a bit better… 🙂


@Sangchul Lee: Thank you very much for your answer; this is actually very interesting, because for the uniform distribution, $X-1$ and $\ln(X)$ are regularly varying at zero with exponent $\alpha=1$.

This is equivalent to $1/(X-1)$ and $1/\ln(X)$ being regularly varying at infinity with exponent $-\alpha=-1$; using well-known results, we can show that $\frac{\lor (1/\ln(X))}{b_n}$ then converges to a Fréchet distribution with exponent $\alpha$, and by the close connection between maxima and minima we know that

$$\mathbb P\left(\frac{\land \ln(X)}{b_n}\leq x\right)$$

or also

$$\mathbb P\left(\frac{\land X-1}{b_n}\leq x\right)$$

converges to $1-\Phi_{\alpha}(1/x)$, where $\Phi_\alpha$ denotes the Fréchet cdf; and since $XY-1$ is regularly varying at zero with exponent $2\alpha$, one gets the corresponding convergence with the $e^{-t^2}$ tail.
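The claim that $XY-1$ is regularly varying at zero with exponent $2\alpha$ can be checked by hand in the uniform case (it is essentially the same computation as the area estimate in the answer below): for $X,Y$ uniform on $[1,2]$ and $\varepsilon \downarrow 0$,

$$\mathbb P(XY-1\leq\varepsilon)=\int_1^{1+\varepsilon}\mathbb P\!\left(Y\leq\frac{1+\varepsilon}{x}\right)dx=\int_1^{1+\varepsilon}\left(\frac{1+\varepsilon}{x}-1\right)dx=(1+\varepsilon)\ln(1+\varepsilon)-\varepsilon\sim\frac{\varepsilon^{2}}{2}$$

(the integrand vanishes for $x>1+\varepsilon$), which is regular variation with exponent $2=2\alpha$ for $\alpha=1$.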

Which brings me back to your idea; I suspect that it might be possible to prove it more generally for all regularly varying random variables, but I need to think about it tomorrow…

But it is definitely a very interesting and smart answer, thank you so much for it… 🙂


Okay, I'll give it a try, although I did not manage to use the regular variation and I have no idea how I could use it; also, I am not sure whether this is a proper proof. The only thing I am going to use is that $\mathbb P(Y \in [1,a])\to 0$ and $\mathbb P(X \in [1,a])\to 0$ as $a \downarrow 1$. (Does this follow from continuity, or can we construct some weird distribution that is continuous but still does not satisfy this?)

Anyway: we generally assume $X$ and $Y$ to be independent but not necessarily identically distributed. We know that

$$\lim\limits_{n \to \infty} \mathbb P \left(\frac{\land X-1}{b_n}> x_1 \right) = \exp(-x_1^{\alpha_1})$$

and

$$\lim\limits_{n \to \infty} \mathbb P \left(\frac{\land XY-1}{\bar b_n}> x_2 \right) = \exp(-x_2^{\alpha_2})$$

So these are our assumptions; Then it follows that:

$$\lim\limits_{n \to \infty} \mathbb P \left(\frac{\land X-1}{b_n}> x_1 \right) =\lim\limits_{n \to \infty} \mathbb P \left(\frac{ X-1}{b_n}> x_1 \right)^n= \exp(-x_1^{\alpha_1})$$

and therefore, in particular, as $n \to \infty$,

$$\mathbb P \left(\frac{ X-1}{b_n}> x_1 \right)=1-\frac{x_1^{\alpha_1}+o(1)}{n},$$

respectively

$$\mathbb P \left(\frac{ X-1}{b_n}\leq x_1 \right)=\frac{x_1^{\alpha_1}+o(1)}{n},$$

and also it holds that

$$\mathbb P \left(\frac{ XY-1}{\bar b_n}> x_2 \right)=1-\frac{x_2^{\alpha_2}+o(1)}{n},$$

respectively

$$\mathbb P \left(\frac{ XY-1}{\bar b_n}\leq x_2 \right)=\frac{x_2^{\alpha_2}+o(1)}{n}.$$

Now we have:

$$\lim\limits_{n \to \infty}\mathbb P \left( \frac{\land X-1}{ b_n}> x_1 \right) \mathbb P \left(\frac{\land XY-1}{\bar b_n}> x_2 \right)=\exp(-x_1^{\alpha_1})\exp(-x_2^{\alpha_2})=\exp(-x_1^{\alpha_1}-x_2^{\alpha_2})$$

and

$$\mathbb P \left(\frac{ X-1}{ b_n}> x_1,\frac{ XY-1}{\bar b_n}> x_2 \right)\\
=1- \mathbb P \left(\frac{ X-1}{ b_n}\leq x_1\right)-\mathbb P \left(\frac{ XY-1}{\bar b_n}\leq x_2\right)+\mathbb P \left(\frac{ X-1}{ b_n}\leq x_1,\frac{ XY-1}{\bar b_n}\leq x_2\right)\\
=1-\frac{x_1^{\alpha_1}+o(1)}{n}-\frac{x_2^{\alpha_2}+o(1)}{n} +\mathbb P \left(\frac{ X-1}{ b_n}\leq x_1,\frac{ XY-1}{\bar b_n}\leq x_2\right).$$

Moreover, we have

$$\mathbb P \left(\frac{ X-1}{ b_n}\leq x_1,\frac{ XY-1}{\bar b_n}\leq x_2\right) = \mathbb P \left(\frac{ X-1}{ b_n}\leq x_1\right) \mathbb P \left(\frac{ XY-1}{\bar b_n}\leq x_2 \bigg \vert \frac{ X-1}{ b_n}\leq x_1\right) \\
\leq \frac{x_1^{\alpha_1}+o(1)}{n}\, \mathbb P \left(\frac{Y-1}{\bar b_n}\leq x_2 \bigg \vert \frac{ X-1}{ b_n}\leq x_1\right)=\frac{x_1^{\alpha_1}+o(1)}{n}\, \mathbb P \left(\frac{Y-1}{\bar b_n}\leq x_2\right)=\frac{x_1^{\alpha_1}+o(1)}{n}\, o(1), $$

since $\bar b_{n} \to 0$ as $n \to \infty$, so that $\mathbb P\left(\frac{Y-1}{\bar b_n}\leq x_2\right)=\mathbb P\left(Y \leq 1+x_2\bar b_n\right)\to 0$ by the assumption on $Y$ made at the beginning. Therefore we can see that

$$\lim\limits_{n \to \infty}\mathbb P \left(\frac{\land X-1}{ b_n}>x_1,\frac{\land XY-1}{\bar b_n}> x_2 \right)\\
=\lim\limits_{n \to \infty}\left(1-\frac{x_1^{\alpha_1}+o(1)}{n}\,\bigl(1-o(1)\bigr)-\frac{x_2^{\alpha_2}+o(1)}{n} \right)^n = \exp\left(-x_1^{\alpha_1}-x_2^{\alpha_2}\right),$$

where the first equality uses that the pairs $(X_i,Y_i)$, $i=1,\ldots,n$, are i.i.d.
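The last step uses the elementary fact that $\left(1-\frac{c_n}{n}\right)^n\to e^{-c}$ whenever $c_n\to c$, since

$$\left(1-\frac{c_n}{n}\right)^n=\exp\!\left(n\ln\!\left(1-\frac{c_n}{n}\right)\right)=\exp\!\left(-c_n+\mathcal O\!\left(\frac{c_n^2}{n}\right)\right)\xrightarrow[n\to\infty]{}e^{-c},$$

applied here with $c_n=\bigl(x_1^{\alpha_1}+o(1)\bigr)(1-o(1))+x_2^{\alpha_2}+o(1)\to x_1^{\alpha_1}+x_2^{\alpha_2}$; this is standard calculus, spelled out only for completeness.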

Now I am curious: Is this a valid proof?

Best Answer

Here is a partial answer when both $X_k$ and $Y_k$ are uniformly distributed in $[1, 2]$. I guess it generalizes to a broader class of distributions without much hassle.


We begin with some notation: for each $s, t \geq 0$, define

\begin{align*} \newcommand{\Area}{\operatorname{Area}} \mathcal{A}_n(s) &= \Bigl\{ (x, y) \in [1, 2]^2 : x < 1 + \frac{s}{n} \Bigr\}, \\ \mathcal{B}_n(t) &= \Bigl\{ (x, y) \in [1, 2]^2 : xy < 1 + \sqrt{\frac{2}{n}} \, t \Bigr\}. \end{align*}

Then it follows that $\Area(\mathcal{A}_n(s)) \sim \frac{s}{n}$ and $\Area(\mathcal{B}_n(t)) \sim \frac{t^2}{n}$, and so we get

\begin{align*} \mathbb{P} \biggl( \frac{\min\{X_1,\cdots,X_n\}-1}{1/n} \geq s \biggr) &= \mathbb{P}\bigl((X_k, Y_k) \notin \mathcal{A}_n(s) \text{ for all } k = 1, \cdots, n\bigr) \\ &= \biggl( 1 - \Area(\mathcal{A}_n(s)) \biggr)^n \xrightarrow[n\to\infty]{} e^{-s} = 1 - F(s) \end{align*}

and similarly

\begin{align*} \mathbb{P} \biggl( \frac{\min\{X_1 Y_1,\cdots,X_n Y_n\}-1}{\sqrt{2/n}} \geq t \biggr) &= \mathbb{P}\bigl((X_k, Y_k) \notin \mathcal{B}_n(t) \text{ for all } k = 1, \cdots, n\bigr) \\ &= \biggl( 1 - \Area(\mathcal{B}_n(t)) \biggr)^n \xrightarrow[n\to\infty]{} e^{-t^2} = 1 - G(t). \end{align*}
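For completeness, the area asymptotics used above can be checked directly: $\operatorname{Area}(\mathcal{A}_n(s))=\frac{s}{n}$ exactly (for $s\leq n$), and, writing $\varepsilon_n=\sqrt{2/n}\,t$,

$$\operatorname{Area}(\mathcal{B}_n(t))=\int_1^{1+\varepsilon_n}\left(\frac{1+\varepsilon_n}{x}-1\right)dx=(1+\varepsilon_n)\ln(1+\varepsilon_n)-\varepsilon_n\sim\frac{\varepsilon_n^2}{2}=\frac{t^2}{n}.$$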

Finally, it follows that the probability of the joint event is

\begin{align*} &\mathbb{P} \biggl( \biggl\{ \frac{\min\{X_1,\cdots,X_n\}-1}{1/n} \geq s \biggr\} \cap \biggl\{ \frac{\min\{X_1 Y_1,\cdots,X_n Y_n\}-1}{\sqrt{2/n}} \geq t \biggr\} \biggr) \\ &= \mathbb{P}\bigl((X_k, Y_k) \notin \mathcal{A}_n(s) \cup \mathcal{B}_n(t) \text{ for all } k = 1, \cdots, n\bigr) \\ &= \biggl( 1 - \Area(\mathcal{A}_n(s)) - \Area(\mathcal{B}_n(t)) + \Area(\mathcal{A}_n(s) \cap \mathcal{B}_n(t)) \biggr)^n. \end{align*}

But it is easy to check that $\Area(\mathcal{A}_n(s) \cap \mathcal{B}_n(t)) = \mathcal{O}(n^{-3/2})$, hence the probability above equals

\begin{align*} \biggl( 1 - \frac{s+o(1)}{n} - \frac{t^2+o(1)}{n} + \mathcal{O}(n^{-3/2}) \biggr)^n \xrightarrow[n\to\infty]{} e^{-s-t^2} = (1 - F(s))(1 - G(t)). \end{align*}
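The bound on the intersection can be seen quickly: for every $x\in\bigl[1,1+\frac{s}{n}\bigr]$, the admissible $y$ in $\mathcal{A}_n(s)\cap\mathcal{B}_n(t)$ form an interval of length at most $\bigl(1+\sqrt{2/n}\,t\bigr)-1=\sqrt{2/n}\,t$, hence

$$\operatorname{Area}(\mathcal{A}_n(s)\cap\mathcal{B}_n(t))\leq\frac{s}{n}\cdot\sqrt{\frac{2}{n}}\,t=\frac{\sqrt{2}\,st}{n^{3/2}}=\mathcal{O}(n^{-3/2}).$$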

Therefore the limiting joint distribution factorizes, i.e., the two rescaled minima are asymptotically independent.
