Probability – How to Get IID Samples from Independence Coupling $P_X \otimes P_Y$

it.information-theory, optimal-transportation, pr.probability, reference-request, sampling

Let $(X,Y)$ be a pair of random variables on a measure space $\mathcal T \subseteq \text{"subsets of }\mathbb R^2\text{"}$, with joint probability distribution $P$.

We don't assume $X$ and $Y$ are independent!

Let $P_X$ (resp. $P_Y$) be the marginal distribution of $X$ (resp. $Y$), defined by $P_X(A) := \int_{p_1(A)} \,dP$, where $p_k(A) := \{(z_1,z_2) \in \mathcal T \mid z_k \in A\}$ defines the projection operator onto the $k$th coordinate of $\mathcal T$. Let $\Pi(X,Y)$ be the set of all couplings of $X$ and $Y$, i.e. the set of all probability distributions on $\mathcal T$ with the same marginals as $P$. Finally, let $P_X \otimes P_Y \in \Pi(X,Y)$ be the independence coupling of $X$ and $Y$ defined by
$$
(P_X \otimes P_Y)(U) := P_X(p_1(U))\cdot P_Y(p_2(U)).
$$

Let $k$ and $n$ be positive integers, presumably with $n \gg k$.

Question. Given $n$ independent copies $(X_1,Y_1),\ldots,(X_n,Y_n)$ of $(X,Y)$ (i.e. an iid sample of size $n$ from the joint distribution $P$), what is a principled way to obtain an iid sample of size $k$ from $P_X \otimes P_Y$?

Best Answer

It is unclear to me what "a principled way" could mean.

However, given $n$ iid pairs $(X_1,Y_1),\dots,(X_n,Y_n)$, it is easy to get $k:=\lfloor n/2\rfloor$ iid pairs $(X_1,Z_1),\dots,(X_k,Z_k)$ such that, for each $j\in\{1,\dots,k\}$, (i) the random variables $X_j$ and $Z_j$ are independent and (ii) $Z_j$ equals $Y_j$ in distribution:

Just let $Z_j:=Y_{k+j}$ for all $j\in\{1,\dots,k\}$.
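The split construction above can be sketched in a few lines of Python with NumPy. This is only an illustration: the function name `split_sample` and the toy data (a normal $X$ with $Y$ equal to $X$ plus small noise) are assumptions made for the example, not part of the question. Since the pairs are independent across indices, $X_j$ is independent of $Y_{k+j}$, and the $k$ new pairs use disjoint sets of original pairs, so they are iid.

```python
import numpy as np

def split_sample(x, y):
    """Pair X_j with Z_j := Y_{k+j}, where k = floor(n/2).

    x, y: arrays of equal length n holding the paired sample (X_i, Y_i).
    Returns two arrays of length k. If n is odd, the last pair is dropped.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same length")
    k = x.shape[0] // 2
    # First half of the x's, second half of the y's: no original pair
    # contributes both coordinates to the same new pair.
    return x[:k], y[k:2 * k]

# Hypothetical toy data: Y depends strongly on X.
rng = np.random.default_rng(0)
n = 10_000
x = rng.standard_normal(n)
y = x + 0.1 * rng.standard_normal(n)

u, z = split_sample(x, y)
corr_before = np.corrcoef(x, y)[0, 1]  # close to 1 for this toy data
corr_after = np.corrcoef(u, z)[0, 1]   # should be near 0
```

A near-zero sample correlation is only a sanity check, not a proof of independence; the independence here comes from the construction itself.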


If the joint probability distribution $P_{X,Y}$ of the pair $(X,Y)$ is known, then it may be possible to get an iid sample of size $n$ from the distribution $P_X\otimes P_Y$ using an iid sample of (the same) size $n$ from the distribution $P_{X,Y}$. This could be done by applying a transformation $T\colon\mathbb R^2\to\mathbb R^2$ to the pair $(X,Y)$ to get a pair $(U,V):=T(X,Y)$ with distribution $P_X\otimes P_Y$.
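One case where such a $T$ can be written down explicitly is the bivariate normal. Assume (this is an assumption for illustration, not something stated in the question) that $X$ and $Y$ are both standard normal with known correlation $\rho \in (-1,1)$. Then $U := X$ and $V := (Y-\rho X)/\sqrt{1-\rho^2}$ are jointly normal, uncorrelated, and each standard normal, so $(U,V)$ has distribution $P_X \otimes P_Y$. A minimal sketch, with the hypothetical helper name `decouple_gaussian`:

```python
import numpy as np

def decouple_gaussian(x, y, rho):
    """Apply T(x, y) = (x, (y - rho*x) / sqrt(1 - rho^2)).

    Assumes (X, Y) is bivariate normal with standard normal marginals
    and known correlation rho, with -1 < rho < 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return x, (y - rho * x) / np.sqrt(1.0 - rho**2)

# Hypothetical example: draw n dependent pairs, then transform all n of them.
rng = np.random.default_rng(1)
rho = 0.8
n = 20_000
cov = [[1.0, rho], [rho, 1.0]]
xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)

u, v = decouple_gaussian(xy[:, 0], xy[:, 1], rho)
corr_uv = np.corrcoef(u, v)[0, 1]  # should be near 0
std_v = v.std()                    # should be near 1
```

Unlike the split construction, this keeps all $n$ pairs, but it relies on knowing the joint distribution exactly.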

The transformation $T$ can apparently be obtained by discrete approximation, as follows. For each natural $n$, let $X_n$ and $Y_n$ be discrete random variables (r.v.'s), say each taking only finitely many values, such that $X_n\to X$ and $Y_n\to Y$ in probability (as $n\to\infty$). Then $P_{X_n,Y_n}\to P_{X,Y}$, $P_{X_n}\to P_X$, $P_{Y_n}\to P_Y$, and $P_{X_n}\otimes P_{Y_n}\to P_X\otimes P_Y$ weakly.

Then for each natural $n$ there is a transformation $T_n\colon\mathbb R^2\to\mathbb R^2$ such that the pair $(U_n,V_n):=T_n(X_n,Y_n)$ has the distribution $P_{X_n}\otimes P_{Y_n}$. This follows because any discrete set can be transformed bijectively to a set on the real line.

If now the set $\{T_n\colon n\in\mathbb N\}$ is compact in an appropriate sense, then, passing to a subsequence, from the pairs $(U_n,V_n)=T_n(X_n,Y_n)$ with distributions $P_{X_n}\otimes P_{Y_n}$ one will get a pair $(U,V):=T(X,Y)$ with distribution $P_X\otimes P_Y$.
