Hypothesis Testing – Why Large Sample Sizes Always Show Significant Results Unless True Effect Size Is Zero

hypothesis testing

I am curious about a claim made in Wikipedia's article on effect size.
Specifically:

[…] a non-null statistical comparison will always show statistically
significant results unless the population effect size is exactly zero

I am not sure what this means or implies, let alone what argument would back it up. After all, an effect size is a statistic, i.e., a value calculated from a sample, and it has its own sampling distribution. Does this mean that effect sizes are never due to mere random variation (which is what I understand it means for a result not to be significant)? Do we then just consider whether the effect is strong enough, i.e., has a large enough absolute value?

Consider the effect size I am most familiar with, the Pearson correlation coefficient $r$; it seems to contradict this claim. Why would an arbitrarily small $r$ be statistically significant? If $r$ is small, the fitted regression line is
$$ y = ax + b = r\left(\frac{s_y}{s_x}\right)x + b = \epsilon x + b. $$

For small $\epsilon$ the estimated slope is close to 0, and an F-test (or, equivalently, a t-test on the slope) will likely not reject, i.e., the confidence interval for the slope will contain 0. Isn't this a counterexample?
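
To make this concrete, here is a minimal simulation sketch of the situation I have in mind, assuming a small but nonzero population correlation of 0.05 (an arbitrary value) and a moderate sample of $n = 100$; I use scipy.stats.linregress only because it reports a p-value for the slope directly:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Assume a small but nonzero population correlation (hypothetical value).
true_r = 0.05
n = 100  # a moderate sample size

# Draw (x, y) from a bivariate normal with correlation true_r.
cov = [[1.0, true_r], [true_r, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# linregress reports the slope and a two-sided p-value for the null "slope = 0".
result = stats.linregress(x, y)
print(f"slope = {result.slope:+.4f}, p-value = {result.pvalue:.3f}")
```

At this sample size the p-value typically comes out far above 0.05, which is exactly why the quoted claim puzzles me.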

Best Answer

As a simple example, suppose that I am estimating your height using some statistical mumbo jumbo.

You've always stated to others that you are 177 cm (about 5 ft 10 in).

If I were to test this hypothesis (that your height is equal to 177 cm, $h = 177$), and I could reduce the error in my measurement enough, then I could show that you are not in fact exactly 177 cm. Eventually, if I estimate your height to enough decimal places, you would almost surely deviate from the stated height of 177.00000000 cm. Perhaps you are really 177.02 cm; then I only have to reduce my measurement error to less than 0.02 cm to find out that you are not 177 cm.

How do I reduce the error in statistics? Get a bigger sample: the standard error of an estimate typically shrinks in proportion to $1/\sqrt{n}$. If you collect a large enough sample, the error gets so small that you can detect even the most minuscule deviation from the null hypothesis, as long as the true deviation is not exactly zero.
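
To see this numerically, here is a minimal sketch of the height example, assuming (arbitrarily) a true height of 177.02 cm, measurement noise of 0.5 cm, and a one-sample t-test against the claimed 177 cm:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

true_height = 177.02    # the "real" height (hypothetical value)
claimed_height = 177.0  # the null hypothesis
sigma = 0.5             # assumed measurement noise in cm (hypothetical value)

# The standard error of the mean is sigma / sqrt(n), so more measurements
# make the test sensitive to ever smaller deviations from the null.
for n in [100, 10_000, 1_000_000]:
    measurements = rng.normal(true_height, sigma, size=n)
    t_stat, p_value = stats.ttest_1samp(measurements, claimed_height)
    print(f"n = {n:>9,}: mean = {measurements.mean():.4f} cm, "
          f"t = {t_stat:+.1f}, p-value = {p_value:.2e}")
```

The deviation of 0.02 cm never changes; only the standard error shrinks. The same logic applies to the small correlation in the question: the slope's standard error also falls like $1/\sqrt{n}$, so for any fixed nonzero $r$ the confidence interval eventually excludes 0 and the test rejects.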