Solved – choose to use only Shapiro-Wilk


I ran a normality test on my data and, as usual, SPSS reported both Shapiro-Wilk and Kolmogorov-Smirnov numbers. Is it okay if I use only the number from Shapiro-Wilk? The number from SW is the one that showed significant level (0.183), while KS is not (0.046). Are there any references I can use to support my decision to use only SW when both SW and KS were given? Thanks.

Best Answer

  1. If you want a formal hypothesis test of some hypothesis, you should use one test to test that hypothesis.

    You should choose that test before you see data, not after you have results in front of you.

    You should normally choose that test so that it gives the best power against the alternatives that matter to you. If you're looking at data (let alone p-values), it's already too late to do this cleanly. This is an argument against packages that present a laundry list of tests as a matter of course: they directly encourage p-hacking, because consciously or unconsciously there will be a tendency to focus on the result you were looking for. Better, I think, is the design philosophy of packages that give you only the tests you ask for, so that you at least make a conscious decision about what you're going to test and when.

  2. It's easy (before the fact) to justify using the Shapiro-Wilk -- it's generally more powerful than most of its competitors, including what SPSS is calling the Kolmogorov-Smirnov, which I assume is actually Lilliefors' test. The actual Kolmogorov-Smirnov test requires a fully specified distribution, so it is not a test of general normality; it's not clear why they'd choose to erase Lilliefors' contribution.

  3. If you're actually trying to check suitability of assumptions of some other procedure, formal hypothesis testing is generally unsuitable.

    Firstly, see "Is normality testing essentially useless", especially the answer by Harvey.

    Secondly, if you choose between procedures on the basis of a test of normality (say, one that assumes normality if you fail to reject and one that doesn't if you reject), you alter the properties (significance level and power) of both of the procedures you're choosing between, and the result is not necessarily what you might hope for. Typically, if you're not comfortable justifying a normal-theory procedure before you see the data, you should probably just use a procedure that either doesn't depend on that assumption at all or is at least pretty robust to it (and it's not just level-robustness that matters, though you'd hardly guess that from many discussions of robustness of tests).

  4. The phrasing in "the number from SW is the ones that showed significant level, while KS is otherwise" is unclear. If you actually mean that the Shapiro-Wilk would reject the null while the other test would not (or vice versa), using that significance or non-significance as a reason to choose the test is unambiguously p-hacking. If you're choosing between tests post hoc on the basis of whether they rejected or didn't reject, you have to toss out the p-values you're looking at because they no longer mean much of anything; if you present the results as if you had just run one test, you're misleading the people who read your work.

  5. I note from your previous question that $n=5$. That's not much to go on: power may be pretty low against some kinds of alternatives that could matter. With a sample that small, neither a rejection nor a non-rejection is particularly informative. If we seriously entertain the possibility that the null is true (the population could actually be normal), the power of the Kolmogorov-Smirnov may be so low that a rejection is fairly likely to just represent a type I error.

    If there's no good reason to anticipate normality, unless you have a procedure that's quite robust to the assumption, I'd be inclined to avoid assuming it.
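
Points 2 and 5 above can be explored with a small Monte Carlo sketch. It is only an illustration under stated assumptions, not SPSS's implementation: the function names (`ks_distance`, `lilliefors_distance`, `simulate`) are hypothetical, the critical values are simulated rather than taken from published tables, and the exponential alternative is an arbitrary choice of "something skewed". The sketch compares the null distribution of the Kolmogorov-Smirnov distance when the normal parameters are known with the same distance when the mean and standard deviation are estimated from the sample (the Lilliefors setting), then estimates how often the Lilliefors-style test rejects at $n=5$.

```python
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()  # standard normal, used for its CDF


def ks_distance(sample, mu, sigma):
    """Largest gap between the empirical CDF and the Normal(mu, sigma) CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = STD_NORMAL.cdf((x - mu) / sigma)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def lilliefors_distance(sample):
    """Same distance, but with mean and sd estimated from the sample itself."""
    return ks_distance(sample, mean(sample), stdev(sample))


def quantile(values, q):
    """Crude empirical quantile (no interpolation); fine for a sketch."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]


def simulate(n=5, reps=20000, alpha=0.05, seed=1):
    rng = random.Random(seed)

    # Null distributions of the distance under normal data.
    d_known, d_estimated = [], []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        d_known.append(ks_distance(x, 0.0, 1.0))   # fully specified null
        d_estimated.append(lilliefors_distance(x))  # parameters estimated

    crit_known = quantile(d_known, 1 - alpha)
    crit_estimated = quantile(d_estimated, 1 - alpha)

    # Rejection rate under the null if the fully specified critical value
    # is (wrongly) applied to the estimated-parameter distance.
    size_wrong_table = sum(d > crit_known for d in d_estimated) / reps

    # Power of the Lilliefors-style test against an exponential alternative
    # (assumption: chosen only as an example of a skewed population).
    rejections = 0
    for _ in range(reps):
        x = [rng.expovariate(1.0) for _ in range(n)]
        if lilliefors_distance(x) > crit_estimated:
            rejections += 1
    power = rejections / reps

    return crit_known, crit_estimated, size_wrong_table, power


if __name__ == "__main__":
    ck, ce, size, power = simulate()
    print(f"critical value, parameters known:     {ck:.3f}")
    print(f"critical value, parameters estimated: {ce:.3f}")
    print(f"null rejection rate with wrong table: {size:.4f}")
    print(f"power vs exponential at n=5:          {power:.3f}")
```

If the simulated critical value with estimated parameters comes out well below the fully specified one, that is the reason a plain Kolmogorov-Smirnov table cannot be used once the mean and standard deviation come from the data. The power figure gives a rough sense of how little a sample of five can say either way.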