Shapiro Wilk Test – Is the Shapiro Wilk Test W an Effect Size?

effect-sizehypothesis testingnormality-assumption

I want to avoid misusing normality tests where a large enough sample size will highlight any slight non-normality. I want to be able to say that a distribution is "normal enough".

When the population is non-normal the p-value for the Shapiro-Wilk test tends to 0 as the sample size increases. The p-value isn't helpful in deciding if a distribution is "normal enough".

I think a solution would be to measure the effect size of the non-normality and reject anything which is more non-normal than a threshold.

The Shapiro Wilk test produces a test statistic $W$. Is this a way to measure the effect size of the non-normality?

I tested this in R by doing a shapiro wilk test on samples drawn from a uniform distribution. The number of samples ranged from 10 to 5000, the results are plotted below. The value of W does converge to a constant, it doesn't tend towards $1$. I'm unsure if $W$ is biased for small samples, it seems to be low for small sample sizes. If $W$ is a biased estimate of effect size that could be a problem if I want to accept anything under $W=0.1$ as "normal enough".

My two questions are:

  1. Is $W$ a measure of effect size of non-normality?

  2. Is $W$ biased for small sample sizes?

plot

Best Answer

As you know, $W$ is a test statistic. In most cases (all consistent tests), a test statistic is not a suitable effect estimator as the statistic reflects the sample size whereas the effect estimator shall be independent of it. Just think of an asymptotic test to test zero mean under the central limit theorem: The approximate distribution is the same for all $n$, so the test statistic contains even all the information about the sample size. That makes the test statistic unsuitable as effect estimater.

For $W$, it is similar (although the approximate distribution depends on the sample size as well). The lower bound for $W$ is $\frac{a_1^2n}{(n-1)}$, where $a_1$ depends on is the expectation for the smallest order statistic.

So no, it is no suitable effect estimator at all.

In fact, I think you are not yet sure what you are looking for as the term "effect" is a bit more difficult than in the usual parametric world of one-dimensional parameters. Here, the raw effect of a.s. not being normally distributed is infinite dimensional: Each measurable subset of $\mathbb{R}$ can have a different probability from the normal distribution model. For a one-dimensional effect, you need to weight it somehow and be aware of the consequences of various weights to your intended application. This way you would decide if e.g. a certain bimodal distribution with Gaussian tails is more normal than a certain unimodal distribution with heavy tails. In fact trading the tail behaviour against the non-tail behaviour might be the most relevant question to invent a suitable effect.

Then, if will be much easier to find an estimator for this particular effect.

Related Question