Hypothesis Testing – Choosing Between Z-Test and T-Test Based on Assumptions

Tags: assumptions, hypothesis testing, normal distribution, t-test, z-test

Background: I'm giving a presentation to colleagues at work on hypothesis testing. I understand most of it fine, but there's one aspect that I'm tying myself in knots trying to understand, let alone explain to others.

This is what I think I know (please correct me if I'm wrong!):

  • Statistics that would be normal if the variance were known follow a $t$-distribution if the variance is unknown
  • CLT (Central Limit Theorem): The sampling distribution of the sample mean is approximately normal for sufficiently large $n$ (could be $30$, could be up to $300$ for highly skewed distributions)
  • The $t$-distribution can be considered Normal for degrees of freedom $> 30$

You use the $z$-test if:

  1. Population normal and variance known (for any sample size)
  2. Population normal, variance unknown and $n>30$ (due to CLT)
  3. Population binomial, $np>10$, $nq>10$

You use the $t$-test if:

  1. Population normal, variance unknown and $n<30$
  2. No knowledge about population or variance and $n<30$, but sample data looks normal / passes tests etc so population can be assumed normal

So I'm left with:

  • Samples with $n > 30$ (and $n < \approx 300$?), no knowledge about the population, variance known or unknown.

So my questions are:

  1. At what sample size can you assume (where there is no knowledge about the population distribution or variance) that the sampling distribution of the mean is normal (i.e. the CLT has kicked in) when the sample data look non-normal? I know that some distributions need $n>300$, but some resources seem to say to use the $z$-test whenever $n>30$…

  2. For the cases I'm unsure about, I presume I look at the data for normality. Now, if the sample data do look normal, do I use the $z$-test (since the population can be assumed normal, and $n>30$)?

  3. What about cases I'm uncertain about where the sample data don't look normal? Are there any circumstances where you'd still use a $t$-test or $z$-test, or do you always look to transform / use non-parametric tests? I know that, due to the CLT, at some value of $n$ the sampling distribution of the mean will be approximately normal, but the sample data won't tell me what that value of $n$ is; the sample data could be non-normal whilst the sample mean follows a normal / $t$ distribution. Are there cases where you'd be transforming / using a non-parametric test when in fact the sampling distribution of the mean was normal / $t$, but you couldn't tell?

Best Answer

@AdamO is right: you simply always use the $t$-test if you don't know the population standard deviation a priori. You don't have to worry about when to switch to the $z$-test, because the $t$-distribution 'switches' for you. More specifically, the $t$-distribution converges to the normal, so it is the correct distribution to use at every $N$.
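You can see this convergence numerically: the two-sided 95% critical value of the $t$-distribution approaches the normal critical value as the degrees of freedom grow, so using $t$ at large $N$ costs essentially nothing. A quick sketch (assuming Python with scipy available):

```python
from scipy import stats

# Two-sided 95% critical values: t with df degrees of freedom vs standard normal
z_crit = stats.norm.ppf(0.975)  # ≈ 1.96
for df in (5, 30, 100, 1000):
    t_crit = stats.t.ppf(0.975, df)
    print(f"df={df:5d}: t crit = {t_crit:.4f}, z crit = {z_crit:.4f}, "
          f"excess = {t_crit - z_crit:.4f}")
```

At $df = 30$ the $t$ critical value is still noticeably above 1.96; by $df = 1000$ the difference is negligible, which is exactly why you never need to decide when to 'switch'.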

There is also a confusion here about the meaning of the traditional line at $N=30$. There are two kinds of convergence that people talk about:

  1. The first is that the sampling distribution of the test statistic (i.e., $t$) computed from normally distributed (within group) raw data converges to a normal distribution as $N\rightarrow\infty$ despite the fact that the SD is estimated from the data. (The $t$-distribution takes care of this for you, as noted above.)
  2. The second is that the sampling distribution of the mean of non-normally distributed (within group) raw data converges to a normal distribution (more slowly than above) as $N\rightarrow\infty$. People count on the Central Limit Theorem to take care of this for them. However, there is no guarantee that it will converge within any reasonable sample size; there is certainly no reason to believe $30$ (or $300$) is the magic number. Depending on the magnitude and nature of the non-normality, it can take a very large sample (cf. @Macro's answer here: Regression when the OLS residuals are not normally distributed). If you believe your (within group) raw data are not very normal, it may be better to use a different type of test, such as the Mann-Whitney $U$-test. Note that with non-normal data, the Mann-Whitney $U$-test is likely to be more powerful than the $t$-test, and can be so even if the CLT has kicked in. (It is also worth pointing out that testing for normality is likely to lead you astray; see: Is normality testing 'essentially useless'?)

At any rate, to answer your questions more explicitly: if you believe your (within group) raw data are not normally distributed, use the Mann-Whitney $U$-test; if you believe your data are normally distributed but you don't know the SD a priori, use the $t$-test; and if you believe your data are normally distributed and you know the SD a priori, use the $z$-test.

It may also help you to read @GregSnow's recent answer regarding these issues here: Interpretation of p-value in comparing proportions between two small groups in R.
