[Math] Central Limit Theorem and sum of squared random variables

This is a two-part question.

Suppose I am drawing random variables $X_i\sim A$, $1\leq i \leq n$, where $A$ is a symmetric, zero-mean probability distribution on the real line with finite variance $\sigma_A^2$ and finite fourth moment $\mathbb{E}(X^4)$. Assume that $X_i$ and $X_j$ are independent for all $i\neq j$. I am interested in approximating the distribution of the sum of squares $\sum_{i=1}^nX_i^2$ with a normal distribution for very large $n$. In the second part of the question I relax the assumption that the $X_i$ are identically distributed, but keep the same conditions on each $A_i$.

By the Central Limit Theorem (CLT), the normalized sum of these i.i.d. random variables satisfies
$$\frac{\sum_{i=1}^nX_i}{\sqrt{n\sigma_A^2}}\xrightarrow{D}\mathcal{N}(0,1)$$

where $\xrightarrow{D}$ denotes convergence in distribution. Thus, I can approximate the distribution of the sum by $\mathcal{N}(0,n\sigma_A^2)$ for large enough $n$.

The first part of my question is: what distribution can I use to approximate the sum of the squares of a large number of these i.i.d. random variables, $\sum_{i=1}^n X^2_i$? Does a function of it converge in distribution to a standard Gaussian as $n\to\infty$? I understand that if $A$ is a Gaussian, then $\frac{1}{\sigma_A^2}\sum_{i=1}^n X^2_i\sim\chi^2(n)$, so $\sum_{i=1}^n X^2_i$ can be approximated by $\mathcal{N}(n\sigma_A^2,2n\sigma_A^4)$ for very large $n$ using the asymptotic properties of the chi-squared distribution (where $\sigma_A^4$ denotes the squared variance). But what happens when $A$ has the nice properties described above, but is not necessarily Gaussian?

My intuition tells me that it should converge to a Gaussian, since we are still dealing with a sum of random variables with finite mean and variance. But I'm not sure how to prove that or how to characterize the distribution in terms of $n$ and $\sigma_A^2$.

The second part of the question is a further generalization of this topic. Now suppose the $X_i\sim A_i$ are not identically distributed. They are all still independent, and the $A_i$ are still zero-mean and symmetric, but they have different finite variances $\sigma_i^2$ and may have different forms. Provided Lindeberg's condition is met (finite means and variances alone do not guarantee it), the CLT holds: $\frac{\sum_{i=1}^n X_i}{\sqrt{\sum_{i=1}^n\sigma_i^2}}\xrightarrow{D}\mathcal{N}(0,1)$. However, again, I am wondering what happens with the sum of squares $\sum_{i=1}^n X_i^2$. Is there a function of it that converges in distribution to a nice random variable such as a Gaussian (i.e., does it look Gaussian for an appropriately large $n$)? If so, to what distribution does it converge, and how can one characterize both the distribution and the function of $\sum_{i=1}^n X_i^2$ in terms of $n$ and $\sigma_i^2$, and possibly $\mathbb{E}(X_i^4)$? Is the result more attainable if each $A_i$ is a zero-mean Gaussian with variance $\sigma_i^2$?

Again, my intuition tells me that a function of $\sum_{i=1}^n X_i^2$ should converge to a Gaussian, since again we are dealing with a sum of random variables with finite means and variances, which should meet Lindeberg's condition… but is there a proof, and how can one characterize this distribution in terms of $n$ and $\sigma_i^2$?

EDITS: I have changed the question after @Michael Hardy answered the first part for me. The second part is still open…

Best Answer

I have a certain degree of discomfort with the expression $$\sum_{i=1}^n X^2_i\xrightarrow{D}\mathcal{N}(n,2n\sigma_A^4)$$ since $n$ appears on both sides. If one takes a limit as $n\to\infty$, one gets something that does not depend on $n$.

When one says $$\sum_{i=1}^n X_i\sim\mathcal{N}(0,n\sigma_A^2),$$ it has to mean that $$ \frac{1}{\sigma_A\sqrt{n}}\sum_{i=1}^n X_i $$ converges in distribution to $\mathcal{N}(0,1)$ as $n\to\infty$, and no $n$ appears in the expression "$\mathcal{N}(0,1)$", which is the limit.

Since $\mathbb{E}(X_i) = 0$, we have $\sigma_A^2=\operatorname{var}(X_i)=\mathbb{E}(X_i^2)$, and $$ \operatorname{var}(X^2) = \mathbb{E}(X^4) - \sigma_A^4. $$

So if this last quantity happens to be finite then the central limit theorem tells us that $$ \frac{\sum_{i=1}^n (X_i^2 - \sigma_A^2) }{\sqrt{n}\sqrt{\mathbb{E}(X^4) - \sigma_A^4}} $$ converges in distribution to $\mathcal{N}(0,1)$ as $n\to\infty$.
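As a rough numerical illustration of the statement above, here is a minimal Python sketch, assuming NumPy is available. It takes $A$ to be a Laplace distribution, which is an arbitrary choice and not something from the question: it is zero-mean and symmetric, with $\mathbb{E}(X^2)=2b^2$ and $\mathbb{E}(X^4)=24b^4$ for scale $b$. The function name `standardized_sum_of_squares` and its parameters are hypothetical.

```python
import numpy as np


def standardized_sum_of_squares(n, trials, scale=1.0, seed=0):
    """Return `trials` draws of the standardized sum of n squared Laplace variables."""
    rng = np.random.default_rng(seed)
    # Assumption for illustration: A = Laplace(0, scale), which is zero-mean,
    # symmetric, and has a finite fourth moment.
    x = rng.laplace(loc=0.0, scale=scale, size=(trials, n))

    var_x = 2.0 * scale**2             # sigma_A^2 = E(X^2)
    fourth_moment = 24.0 * scale**4    # E(X^4) for the Laplace distribution
    var_x2 = fourth_moment - var_x**2  # var(X^2) = E(X^4) - sigma_A^4

    # Center each sum of squares by n * sigma_A^2 and scale by sqrt(n * var(X^2)).
    centered = (x**2).sum(axis=1) - n * var_x
    return centered / np.sqrt(n * var_x2)


z = standardized_sum_of_squares(n=1000, trials=4000)
print("sample mean:", z.mean())
print("sample std: ", z.std())
print("P(Z <= 1.96):", (z <= 1.96).mean())
```

If the normal approximation is reasonable at this $n$, the sample mean and standard deviation of `z` should be near 0 and 1. In the Gaussian case $\mathbb{E}(X^4)=3\sigma_A^4$, so $\operatorname{var}(X^2)=2\sigma_A^4$, which agrees with the $\mathcal{N}(n\sigma_A^2,2n\sigma_A^4)$ approximation mentioned in the question.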
