I have a question concerning a claim I read in statistics books concerning the applicability of the t-distribution to compute confidence intervals for large $n$ if the data is not normally distributed (but has finite variance).
The statement is that for small $n$ one should check the Gaussian assumption on the data, but if $n$ is large enough then the t-distribution can be used to compute the confidence intervals. The idea seems to be that the distribution of the mean will converge to a Normal distribution by the central limit theorem.
Here is what I don't understand: A t-distributed RV with $n-1$ degrees of freedom can be characterized as $T = Z/\sqrt{V/(n-1)}$, where $Z$ is a standard Normal RV and $V$ is an independent $\chi^2$-distributed RV with $n-1$ degrees of freedom. I see that the distribution of the mean will converge to a Normal distribution, but what about the distribution of the standard error? If the original data is not Normal, why should the (suitably scaled) standard error follow a $\chi$-distribution? Therefore, why should the standardized RV have a t-distribution?
The only way I could understand this is that for large $n$:

- the distribution of the (standardized) sample mean converges to a Normal distribution,
- the t-distribution converges to a Normal distribution,

so basically for large $n$ one could use a Gaussian distribution anyway for the confidence interval. However, this reasoning does not actually need the "detour" via the t-distribution.
Or is there another mathematical reason why the mean divided by the standard error should have a t-distribution? Am I missing something? Thanks for your help.
Best Answer
Ok, after the hint of Procrastinator I think this is the answer (please correct me if I missed something).
First of all, $\frac{\overline{X}_n-\mu}{S_n/\sqrt{n}}$ is exactly t-distributed with $n-1$ degrees of freedom if $\overline{X}_n$ has a Normal distribution, $(n-1)S_n^2/\sigma^2$ has a $\chi^2$-distribution with $n-1$ degrees of freedom, and $\overline{X}_n$ and $S_n$ are independent. In that sense, the defining conditions are on $\overline{X}_n$ and $S_n$, not directly on the individual $X_i$ that make up the sample.
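This characterization can be checked directly by simulation; the degrees of freedom and number of draws below are illustrative choices:

```python
import numpy as np

# Sketch: build t-distributed draws from the characterization
# T = Z / sqrt(V / nu), with Z standard normal and V ~ chi^2_nu
# drawn independently. nu = 5 and n_draws are illustrative.
rng = np.random.default_rng(0)
nu = 5
n_draws = 200_000

Z = rng.standard_normal(n_draws)   # numerator: N(0, 1)
V = rng.chisquare(nu, n_draws)     # denominator: chi^2 with nu df
T = Z / np.sqrt(V / nu)            # t-distributed with nu df

# A t_5 variable has mean 0 and variance nu / (nu - 2) = 5/3;
# the sample moments of T should be close to these values.
print(T.mean(), T.var())
```

The heavier tails of $T$ compared to $Z$ come entirely from the random denominator, which is the role the sample standard deviation plays in the t-statistic.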
For non-normal data, the standardized sample mean $\sqrt{n}(\overline{X}_n-\mu)/\sigma$ converges in distribution to a standard Normal by the central limit theorem. By the (strong) law of large numbers the sample variance $S^2_n$ converges almost surely to the population variance $\sigma^2$. Since almost sure convergence is preserved under continuous mappings, the sample standard deviation also converges almost surely to the population standard deviation: $S_n \rightarrow_{a.s.} \sigma$.
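The convergence $S_n \rightarrow \sigma$ holds for any finite-variance distribution; a quick sketch with Exp(1) data (clearly non-normal, with $\sigma = 1$), where the sample sizes are illustrative:

```python
import numpy as np

# Sketch: LLN + continuous mapping in action. For Exp(1) data
# (non-normal, sigma = 1), the sample standard deviation S_n
# approaches sigma as n grows. Sample sizes are illustrative.
rng = np.random.default_rng(1)
sigma = 1.0                         # sd of the Exp(1) distribution
for n in (10, 1_000, 100_000):
    x = rng.exponential(scale=1.0, size=n)
    s_n = x.std(ddof=1)             # sample standard deviation
    print(n, abs(s_n - sigma))
```

The absolute error shrinks as $n$ grows, even though $S_n$ is nowhere near $\chi$-distributed here.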
Since almost sure convergence implies convergence in probability, we can now apply Slutsky's theorem, which states (for this case) that if $X_n \rightarrow_D X$ and $Y_n \rightarrow_P c$ for a constant $c \neq 0$, then $X_n/Y_n \rightarrow_D X/c$. For our case, with $X_n = \sqrt{n}(\overline{X}_n-\mu)/\sigma$ and $Y_n = S_n/\sigma$, this means that
$$\frac{\overline{X}_n-\mu}{S_n/\sqrt{n}} = \frac{\sqrt{n}(\overline{X}_n-\mu)/\sigma}{S_n/\sigma} \rightarrow_D N(0,1).$$
So even if the sample standard deviation is not (scaled) $\chi$-distributed, the t-statistic still converges in distribution to a standard Normal by the central limit theorem and Slutsky's theorem. So I guess what my statistics book meant is essentially what I expected:
- for large $n$ the (standardized) sample mean converges in distribution to a Normal distribution,
- for large degrees of freedom the t-distribution converges to a Normal distribution,
- even if the sample standard deviation is not $\chi$-distributed and not independent of the sample mean, it does not "disturb" the convergence of the t-statistic to the Normal distribution.
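The whole argument can be checked by simulation: studentize means of skewed samples and see how close the result is to $N(0,1)$. The choice of Exp(1) data, $n = 500$, and the number of replications are illustrative:

```python
import numpy as np

# Sketch: Slutsky + CLT for skewed data. For Exp(1) samples
# (mu = 1), the studentized mean sqrt(n)*(xbar - mu)/S_n should be
# approximately N(0, 1) once n is large. n and reps are illustrative.
rng = np.random.default_rng(2)
mu, n, reps = 1.0, 500, 20_000

x = rng.exponential(scale=1.0, size=(reps, n))
xbar = x.mean(axis=1)               # one sample mean per replication
s_n = x.std(axis=1, ddof=1)         # one sample sd per replication
t_stat = np.sqrt(n) * (xbar - mu) / s_n

# For N(0, 1), about 95% of draws fall in [-1.96, 1.96].
coverage = np.mean(np.abs(t_stat) <= 1.96)
print(coverage)
```

The empirical coverage lands close to 0.95 even though the underlying data is strongly skewed, exactly as the Slutsky argument predicts.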
Therefore, we can use the t-distribution to compute confidence intervals for large $n$, even though asymptotically this amounts to using the Normal distribution. I guess the t-distribution is preferred simply because it is a bit more conservative: its critical values are slightly larger, so the resulting intervals are slightly wider.
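The "slightly more conservative" claim can be made concrete by comparing critical values (this sketch assumes SciPy is available; the df values are illustrative):

```python
from scipy.stats import norm, t

# Sketch: the 95% t critical value always exceeds the normal one,
# so t intervals are slightly wider; the gap vanishes as df grows.
z_crit = norm.ppf(0.975)            # ~1.96
for df in (10, 30, 100, 1000):
    t_crit = t.ppf(0.975, df)
    print(df, round(t_crit, 4), round(t_crit - z_crit, 4))
```

For $n$ in the hundreds the two critical values agree to two decimal places, which is why the choice barely matters in practice at large $n$.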