Solved – Normality identifier in Shapiro-Wilk test


When using the Shapiro-Wilk test, should I look at the p-values or the W values to find out which of my samples is the "most" normal? For example, if I run the Shapiro-Wilk test on iq, age, and weight and want to know which variable is closest to normal, should I compare the p-values or the W values? And what number should the value be close to?

Best Answer

You're asking for something like an effect size (a "how big?" type of question).

P-values don't measure that; at a given value of W, the p-value tends to go down as the sample size n goes up.

The Shapiro-Wilk statistic, W, is in some sense a measure of "closeness to what you'd expect to see with normality", akin to a squared correlation. (If I recall correctly, the closely related Shapiro-Francia statistic is actually a squared correlation between the ordered data and the normal scores, while the Shapiro-Wilk statistic tends to be slightly larger; I seem to recall that it takes into account correlations between order statistics.)
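To make the "squared correlation" idea concrete, here is a minimal sketch of a Shapiro-Francia-style statistic in plain Python. It is not the Shapiro-Wilk W itself, and it makes some assumptions of my own: the normal scores are approximated with Blom's formula, and the two variables (`normal_like`, `skewed`) are simulated stand-ins rather than real data.

```python
import random
from statistics import NormalDist


def normal_scores(n):
    """Approximate expected normal order statistics (Blom's formula, an assumption)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def shapiro_francia(sample):
    """Squared Pearson correlation between the sorted sample and the normal scores."""
    x = sorted(sample)
    m = normal_scores(len(x))
    x_bar = sum(x) / len(x)
    m_bar = sum(m) / len(m)
    sxy = sum((a - x_bar) * (b - m_bar) for a, b in zip(x, m))
    sxx = sum((a - x_bar) ** 2 for a in x)
    smm = sum((b - m_bar) ** 2 for b in m)
    return sxy ** 2 / (sxx * smm)


rng = random.Random(0)  # fixed seed so the run is repeatable
normal_like = [rng.gauss(100, 15) for _ in range(200)]   # hypothetical IQ-like variable
skewed = [rng.expovariate(1 / 30) for _ in range(200)]   # hypothetical right-skewed variable

w_normal = shapiro_francia(normal_like)
w_skewed = shapiro_francia(skewed)
print(f"normal-like: {w_normal:.3f}  skewed: {w_skewed:.3f}")
```

Because it is a correlation, the statistic is unaffected by shifting or rescaling the data, and it cannot exceed 1; the strongly skewed sample should land noticeably further below 1 than the normal-like one.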

Specifically, values closer to 1 indicate "closer to what you'd expect if the distribution the data were drawn from is normal".

However, keep in mind that W is a random variable; samples can exhibit random fluctuations that don't represent their populations, and summary statistics will follow suit.

It's not immediately clear that it makes sense to compare Shapiro-Wilk statistics across data sets in order to declare one set "more normal" than another; even less so with very different variables and different sample sizes.
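A rough simulation can illustrate why the sample size matters here. The sketch below uses a Shapiro-Francia-style squared correlation as a stand-in for W (an assumption on my part, as are Blom's approximation for the normal scores, the value 0.97, the two sample sizes, and the number of replications). It estimates how often truly normal samples produce a statistic at or below a given value.

```python
import random
from statistics import NormalDist


def normal_scores(n):
    # Blom's approximation to the expected normal order statistics (an assumption).
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def squared_correlation(x, y):
    # Squared Pearson correlation between two equal-length sequences.
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def null_fraction_at_or_below(w_observed, n, reps=2000, seed=1):
    """Monte Carlo estimate of P(statistic <= w_observed) for normal samples of size n."""
    rng = random.Random(seed)
    scores = normal_scores(n)
    hits = 0
    for _ in range(reps):
        sample = sorted(rng.gauss(0, 1) for _ in range(n))
        if squared_correlation(sample, scores) <= w_observed:
            hits += 1
    return hits / reps


# Same statistic value, two different sample sizes.
p_small = null_fraction_at_or_below(0.97, n=20)
p_large = null_fraction_at_or_below(0.97, n=100)
print(f"n=20: {p_small:.3f}  n=100: {p_large:.3f}")
```

The estimated fraction plays the role of a simulated p-value. The same value of the statistic is unremarkable for a small normal sample but sits far in the lower tail for a larger one, so neither the statistic nor its p-value is directly comparable across samples of different sizes.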

Further, choosing the one closest to 1 among a collection of samples may actually be choosing something other than values randomly selected from a normal distribution, for a variety of reasons. For example, goodness-of-fit tests generally tend to be biased tests (for some alternatives they reject less often than they do when the null is true); what makes their criterion "closest" isn't necessarily the thing the test is actually designed to pick up. (I don't know what sorts of small-sample biases the Shapiro-Wilk test specifically may have, however.)

Finally, I don't see any useful point to such an exercise. What possible value can there be in such a procedure?