I'd like to get some real practice with hypothesis testing that goes beyond the plug-and-chug exercises typical of textbooks. I was hoping someone could suggest some good data sets and problems to work on, ideally ones I could show to others as examples of my work. Along those lines, I'd very much appreciate any top-notch references on the topic.
Solved – Data sets and problems for learning hypothesis testing
hypothesis-testing, references
Related Solutions
It is not true. If the null hypothesis is true, it will not be rejected more frequently at large sample sizes than at small ones. The erroneous rejection rate (alpha) is usually set to 0.05, and it is independent of sample size. Therefore, taken literally, the statement is false. Nevertheless, it's possible that in some situations (even whole fields) all nulls are false, and therefore all will be rejected if N is high enough. But is this a bad thing?
What is true is that trivially small effects can be found to be "significant" with very large sample sizes. That does not suggest that you shouldn't have such large sample sizes. What it means is that the way you interpret your finding depends on the effect size and the sensitivity of the test. If you have a very small effect size and a highly sensitive test, you have to recognize that the statistically significant finding may not be meaningful or useful.
Given that some people don't believe that a test of the null hypothesis, when the null is true, has an error rate equal to the selected cutoff at any sample size, here's a simple simulation in R illustrating the point. Make N as large as you like and the rate of Type I errors will remain constant.
# number of subjects in each condition
n <- 100
# number of replications of the study in order to check the Type I error rate
nsamp <- 10000
ps <- replicate(nsamp, {
  # population mean = 0, sd = 1 for both samples; therefore, no real effect
  y1 <- rnorm(n, 0, 1)
  y2 <- rnorm(n, 0, 1)
  tt <- t.test(y1, y2, var.equal = TRUE)
  tt$p.value
})
sum(ps < .05) / nsamp
# ~ .05 no matter how big n is. In particular, the rejection rate does not creep upward as n grows when the null is true.
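As a companion to the simulation above, here is a rough sketch of the other half of the argument: when the null is false by a trivially small amount, the rejection rate does climb toward 1 as n grows. The true difference of 0.05 standard deviations and the particular sample sizes are assumptions chosen purely for illustration; `power.t.test` is from R's built-in `stats` package and gives the power analytically rather than by simulation.

```r
# hypothetical true difference between the two population means, in SD units
# (an assumption for illustration; the original answer does not give a value)
delta <- 0.05
# number of subjects in each condition, from modest to very large
ns <- c(100, 1000, 10000, 100000)
# analytic power of the two-sample t test at alpha = .05 for each n
pw <- sapply(ns, function(n) {
  power.t.test(n = n, delta = delta, sd = 1, sig.level = 0.05)$power
})
# power rises with n even though the effect itself stays tiny
data.frame(n = ns, power = round(pw, 3))
```

The effect is identical in every row, so a rejection at the largest n says more about the sensitivity of the test than about the practical importance of the difference.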
How about Testing Statistical Hypotheses by Lehmann and Romano? The third edition is 786 pages at the PhD statistics level. This book covers both small and large sample theory at a fairly rigorous level. It also introduces some resampling methods, such as the bootstrap. It covers multiple comparisons and goodness of fit testing. Finally, there are over 700 problems.
Best Answer
You are supposed to form your hypotheses before seeing any of the actual data. These hypotheses come from some sort of conceptual framework. Your best bet may be to form a hypothesis on a topic of interest to you and then try to find a data set to test that hypothesis.