Central Limit Theorem (and Berry-Esseen theorem) for non-independent variables

limits-and-convergence pr.probability

Consider the triangular array $X_{n,k}$ such that, for each $n>0$, the variables $(X_{n,1},\cdots,X_{n,n})$ have the following properties:

  1. For any given $1 \le L \le n$, all subsets of $(X_{n,1},\cdots,X_{n,n})$ of size $L$ have the same joint distribution (even after applying an arbitrary permutation).
  2. Each $X_{n,k}$ has zero mean, variance $0<\sigma^2<\infty$, and third absolute moment $0<\rho<\infty$.
  3. Each pair $(X_{n,k_1},X_{n,k_2})$ with $k_1 \ne k_2$ has covariance $\frac{-C}{n}$ for some $0 < C < \infty$ (hence the variables are all negatively correlated, and the correlation tends to zero).

Let $S_n = \sum_{k=1}^{n} X_{n,k}$. Is $\frac{S_n}{\sqrt{\mathrm{Var}[S_n]}}$ asymptotically distributed as $N(0,1)$? If so, is a convergence rate of $O(\frac{1}{\sqrt{n}})$ achieved (cf. Berry-Esseen theorem)? What about in the multivariate setting where each $X_{n,k}$ is a random vector (and the relevant quantities above are replaced by vectors/matrices)?

If it helps, we can also assume that the $X_{n,k}$ are bounded with probability one, uniformly in $n$ and $k$. Answers that rely on assumptions beyond the ones listed will also be appreciated.
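
For reference, conditions 2 and 3 already determine the normalizing variance. The following is a short worked computation using only the assumptions stated above:

$$\mathrm{Var}[S_n] = \sum_{k=1}^{n} \mathrm{Var}[X_{n,k}] + \sum_{k_1 \ne k_2} \mathrm{Cov}[X_{n,k_1},X_{n,k_2}] = n\sigma^2 + n(n-1)\cdot\frac{-C}{n} = n\sigma^2 - (n-1)C.$$

Since a variance is nonnegative, this requires $C \le \frac{n}{n-1}\sigma^2$ for each $n \ge 2$; in particular, if $C$ does not depend on $n$, then $C \le \sigma^2$.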

Best Answer

These types of conditions (exchangeability and $O(1/n)$ covariance) do not even ensure that $S_n$ is approximately normal.

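Take $n$ even, say $n=2m$. (I believe similar examples can be constructed for odd $n$.) Let each $X_{n,k} \in \lbrace -1,1 \rbrace$, and choose the set of $m$ indices with positive sign uniformly at random, so that each such set has probability ${2m \choose m}^{-1}$. Then the sum is always $0$. This satisfies conditions 1 and 2 (with $\sigma^2 = \rho = 1$). Moreover, $P(X_{n,1} = X_{n,2}) = {2m-2 \choose m}/{2m-1 \choose m} = \frac{m-1}{2m-1}$, and since $X_{n,1}X_{n,2} = \pm 1$ and each variable has mean zero, the covariance is $P(X_{n,1} = X_{n,2}) - P(X_{n,1} \ne X_{n,2}) = \frac{m-1}{2m-1} - \frac{m}{2m-1} = \frac{-1}{2m-1} = \frac{-1}{n-1}$.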
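
The covariance claim can be checked by brute force for small $m$. The following is a minimal sketch in Python, not part of the original answer; the function names `balanced_sign_vectors` and `balanced_covariance` are made up for this illustration. It enumerates every balanced sign vector and computes $E[X_{n,1}X_{n,2}]$ exactly with rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations


def balanced_sign_vectors(m):
    """Yield every vector in {-1, +1}^(2m) with exactly m entries equal to +1."""
    n = 2 * m
    for plus in combinations(range(n), m):
        plus_set = set(plus)
        yield tuple(1 if i in plus_set else -1 for i in range(n))


def balanced_covariance(m):
    """Exact Cov(X_1, X_2) under the uniform law on balanced sign vectors."""
    vectors = list(balanced_sign_vectors(m))
    # Each X_k has mean 0 by symmetry, so Cov(X_1, X_2) = E[X_1 * X_2].
    return Fraction(sum(v[0] * v[1] for v in vectors), len(vectors))


for m in range(1, 6):
    # The sum of a balanced sign vector is always 0.
    assert all(sum(v) == 0 for v in balanced_sign_vectors(m))
    # The covariance matches -1/(2m-1) exactly.
    assert balanced_covariance(m) == Fraction(-1, 2 * m - 1)
```

The enumeration grows like ${2m \choose m}$, so this is only meant as a sanity check for small $m$.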

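The point mass at $0$ is still normal, but degenerate. However, this example can be modified slightly so that the sum has positive standard deviation. With probability $\frac{1}{4m^2} = \frac{1}{n^2}$ let all $X_{n,i}$ be equal ($\frac{1}{8m^2}$ chance of all $+1$, $\frac{1}{8m^2}$ chance of all $-1$), and with probability $\frac{4m^2-1}{4m^2}$ let the positive indices be a uniformly random subset of size $m$. Then the covariance is $\text{Cov}(X_{n,1},X_{n,2}) = \frac{1}{n^2} - \frac{n^2-1}{n^2}\cdot\frac{1}{n-1} = \frac{-1}{n}$, and $P(S_n = 0) = \frac{n^2-1}{n^2}$, $P(S_n = n) = P(S_n = -n) = \frac{1}{2n^2}$, so $\text{Var}(S_n) = 1$. Hence $\frac{S_n}{\sqrt{\text{Var}(S_n)}} = S_n$ converges in distribution to the point mass at $0$ rather than to $N(0,1)$: its distribution function jumps from $\frac{1}{2n^2}$ to $1-\frac{1}{2n^2}$ at $0$, so its Kolmogorov distance from the standard normal is at least $\frac{1}{2} - \frac{1}{2n^2}$.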
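
The moments of the modified example can also be verified exactly. The sketch below is an illustration in Python rather than part of the original answer; `modified_example` and `variance` are hypothetical helper names. It uses the fact that $E[X_{n,1}X_{n,2}]$ equals $1$ when all signs agree and $\frac{-1}{n-1}$ in the balanced case.

```python
from fractions import Fraction


def modified_example(m):
    """Return (Cov(X_1, X_2), law of S_n) for the mixture with n = 2m."""
    n = 2 * m
    p_equal = Fraction(1, n * n)   # all signs equal, split evenly between +1 and -1
    p_balanced = 1 - p_equal       # uniformly random balanced sign vector
    # E[X_1 X_2] is +1 when all signs agree and -1/(n-1) in the balanced case.
    cov = p_equal * 1 + p_balanced * Fraction(-1, n - 1)
    # S_n is 0 in the balanced case and +n or -n when all signs agree.
    law = {0: p_balanced, n: p_equal / 2, -n: p_equal / 2}
    return cov, law


def variance(law):
    """Exact variance of a finite distribution given as {value: probability}."""
    mean = sum(s * p for s, p in law.items())
    return sum((s - mean) ** 2 * p for s, p in law.items())


for m in range(1, 50):
    cov, law = modified_example(m)
    assert sum(law.values()) == 1
    assert cov == Fraction(-1, 2 * m)   # covariance is exactly -1/n
    assert variance(law) == 1           # Var(S_n) = 1 for every even n
```

Using `Fraction` avoids floating-point error, so the identities are confirmed exactly for each tested $m$.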
