Confidence Intervals – How to Use T-Distribution Confidence Intervals for Non-Gaussian Data with Large Sample Size?

confidence-interval, mathematical-statistics, t-distribution

I have a question about a claim I read in statistics books concerning the applicability of the t-distribution for computing confidence intervals for large $n$ when the data are not normally distributed (but have finite variance).

The statement is that for small $n$ one should check the Gaussian assumption on the data, but if $n$ is large enough then the t-distribution can be used to compute the confidence intervals. The idea seems to be that the distribution of the mean will converge to a Normal distribution by the central limit theorem.

Here is what I don't understand: A t-distributed RV can be characterized as $Z/\sqrt{V/(n-1)}$, where $Z$ is a standard Normal RV and $V$ is an independent $\chi^2$-distributed RV with $n-1$ degrees of freedom. I see that the distribution of the mean will converge to a Normal distribution, but what about the distribution of the standard error? If the original data are not Normal, why should the (scaled) sample variance follow a $\chi^2$-distribution? And therefore, why should the standardized RV have a t-distribution?

The only way I could understand this is that for large $n$

  • the distribution of the mean converges to a Normal distribution

  • the t-distribution converges to a Normal distribution

so basically for large $n$ one could use a Gaussian distribution anyway for the confidence interval. However, this somehow does not make the "detour" over the t-distribution.

Or is there another mathematical reason why the mean divided by the standard error should have a t-distribution? Am I missing something? Thanks for your help.

Best Answer

Ok, after the hint of Procrastinator I think this is the answer (please correct me if I missed something).

First of all, $\frac{\overline{X}_n-\mu}{S_n/\sqrt{n}}$ has a t-distribution with $n-1$ degrees of freedom if $\overline{X}_n \sim N(\mu, \sigma^2/n)$, $(n-1)S_n^2/\sigma^2$ has a $\chi^2$-distribution with $n-1$ degrees of freedom, and $\overline{X}_n$ and $S_n$ are independent. These conditions are stated in terms of $\overline{X}_n$ and $S_n$; normality of the individual $X_i$ that make up the sample mean $\overline{X}_n$ does not enter directly.
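As a quick sanity check of the exact-t claim for Normal data, one can simulate the statistic at a small sample size and compare its empirical upper quantile with the $t_{n-1}$ critical value rather than the Normal one. This is only a sketch using the Python standard library; the sample size, seed, and repetition count are arbitrary choices of mine:

```python
import random
import statistics

random.seed(3)

# Simulate sqrt(n) * (xbar - mu) / s for Normal data at n = 5 (true mean 0).
# If the statistic is exactly t-distributed with 4 degrees of freedom, its
# empirical 97.5% quantile should sit near t_{0.975,4} ~= 2.776, well above
# the Normal value 1.96.
n, reps = 5, 20_000
tstats = []
for _ in range(reps):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    xbar = statistics.fmean(xs)
    s = statistics.stdev(xs)
    tstats.append(xbar / (s / n ** 0.5))

tstats.sort()
q975 = tstats[int(0.975 * reps)]
print(round(q975, 2))  # noticeably larger than 1.96
```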

For non-normal data, the distribution of the sample mean still converges to a Normal distribution by the central limit theorem. By the law of large numbers the sample variance $S^2_n$ converges almost surely to the population variance $\sigma^2$. Since almost sure convergence is preserved under continuous mappings (here $x \mapsto \sqrt{x}$), the sample standard deviation also converges almost surely to the population standard deviation: $S_n \rightarrow_{a.s.} \sigma$.
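The convergence $S_n \rightarrow \sigma$ is easy to see numerically. A minimal sketch using only the Python standard library, with Exponential(1) data as an arbitrary skewed example (its standard deviation is 1):

```python
import random
import statistics

random.seed(0)

# Law of large numbers in action: the sample standard deviation of
# Exponential(1) data (true sigma = 1) settles down to sigma as n grows.
for n in [10, 100, 10_000]:
    xs = [random.expovariate(1.0) for _ in range(n)]
    s_n = statistics.stdev(xs)
    print(n, round(s_n, 3))
```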

Since almost sure convergence implies convergence in probability, we can now apply Slutsky's theorem, which states (for this case) that if $X_n \rightarrow_D X$ and $Y_n \rightarrow_P c$ with $c \neq 0$, then $X_n/Y_n\rightarrow_D X/c$. For our case, writing

$$\frac{\overline{X}_n-\mu}{S_n/\sqrt{n}} = \frac{\sigma}{S_n}\cdot\frac{\overline{X}_n-\mu}{\sigma/\sqrt{n}},$$

the second factor converges in distribution to $N(0,1)$ by the central limit theorem and the first factor converges in probability to $1$, so the whole statistic converges in distribution to $N(0,1)$.
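This convergence can also be checked by simulation. The sketch below (Python standard library only; Exponential(1) data, $n = 200$, and the repetition count are my arbitrary choices) compares the empirical 2.5% and 97.5% quantiles of the t-statistic with the $N(0,1)$ values $\pm 1.96$:

```python
import random
import statistics

random.seed(1)

# Empirical distribution of the t-statistic for skewed Exponential(1) data
# (true mean mu = 1). By the CLT plus Slutsky, its quantiles should be close
# to the N(0, 1) quantiles -1.96 and 1.96 for moderately large n.
mu, n, reps = 1.0, 200, 4_000
tstats = []
for _ in range(reps):
    xs = [random.expovariate(1.0) for _ in range(n)]
    xbar = statistics.fmean(xs)
    s = statistics.stdev(xs)
    tstats.append((xbar - mu) / (s / n ** 0.5))

tstats.sort()
lo = tstats[int(0.025 * reps)]
hi = tstats[int(0.975 * reps)]
print(round(lo, 2), round(hi, 2))  # roughly -2 and 2, still slightly skewed
```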

So this means that even if the sample standard deviation is not $\chi$-distributed, the t-statistic will still converge to a standard Normal distribution by the central limit theorem and Slutsky's theorem. So I guess what my statistics book meant is roughly what I expected:

  • for large $n$ the sample mean converges in distribution to a Normal distribution

  • for large degrees of freedom the t-distribution converges to a Normal distribution

  • even if the sample standard deviation is not $\chi$-distributed and not independent of the sample mean, it does not ''disturb'' the convergence of the sample mean distribution to the Normal distribution

Therefore, we use the t-distribution for computing confidence intervals for large $n$ even though this amounts to using the Normal distribution. I guess the t-distribution is just used because its slightly heavier tails give slightly wider, more conservative intervals.
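The conservativeness claim can be illustrated with a coverage simulation. This is a sketch under my own choices (Exponential(1) data, $n = 30$, standard table values for the critical points), using only the Python standard library:

```python
import random
import statistics

random.seed(2)

# Coverage of nominal 95% intervals for the mean of Exponential(1) data
# (true mean mu = 1) at n = 30, comparing the t and Normal critical values.
# t_{0.975,29} ~= 2.045 and z_{0.975} ~= 1.960 are standard table values.
n, reps, mu = 30, 5_000, 1.0
t_crit, z_crit = 2.045, 1.960

cover_t = cover_z = 0
for _ in range(reps):
    xs = [random.expovariate(1.0) for _ in range(n)]
    xbar = statistics.fmean(xs)
    se = statistics.stdev(xs) / n ** 0.5
    cover_t += abs(xbar - mu) <= t_crit * se
    cover_z += abs(xbar - mu) <= z_crit * se

# The t interval is wider, so its coverage is never below the z interval's.
print(cover_t / reps, cover_z / reps)
```

For skewed data at this sample size both intervals undercover somewhat, but the t interval is always at least as wide, hence never less conservative.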
