Solved – Shapiro-Wilk-Test with p < .05 but data looks normal

normal distribution, qq-plot, r

I have been working with R for a few months and have read Andy Field's book Discovering Statistics Using R up to Chapter 12.

I have some data that I want to check for normality (for now, without any specific reason).

The data were produced by people filling out an online survey in which they rated an item on a scale from 1 to 4. So the variable is obviously discrete, which is (at least I guess) why I get the strange-looking Q-Q plot below. What I don't understand is this: my data look (at least to me) quite like a normal distribution.

The p-value computed by R is still < 0.01 and I don't know why. Is the distribution really not normal, so that the result is correct, or is the reason that I only have 4 distinct values? My sample size is above 70.

[Figure: Q-Q plot]

Best Answer

A variable with only 4 levels is hard to treat as continuous; often, people don't treat a scale as continuous unless it has at least 5 possible values, or perhaps even 7.

Some would consider this, with 4 possible values, ordinal. Is the difference between 1 and 2 the same as the difference between 3 and 4, conceptually?

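Your Q-Q plot looks strange because the theoretical normal distribution is continuous, so its quantiles can be any real number (e.g., 1.14, 3.33, 2.79, etc.), whereas your observed values are only integers (i.e., 1, 2, 3, 4). The many tied observations show up as flat steps rather than points along a straight line.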
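The stepped shape can be reproduced with simulated data. The sketch below is only an illustration: the sample size of 70, the mean of 2.5, the standard deviation of 0.8, and the variable names `latent` and `responses` are all assumptions, not your actual data. It rounds normally distributed scores onto a 1 to 4 scale and compares them with the theoretical normal quantiles.

```r
# Hypothetical data: 70 simulated survey responses (NOT the asker's data).
set.seed(1)
latent <- rnorm(70, mean = 2.5, sd = 0.8)        # continuous underlying scores
responses <- pmin(pmax(round(latent), 1), 4)     # force onto the integers 1..4

# The observed values fall into at most 4 categories ...
table(responses)

# ... while the theoretical normal quantiles are all distinct real numbers.
theo <- qnorm(ppoints(length(responses)))
length(unique(theo))

# Q-Q plot: tied sample values appear as flat steps, not a straight line.
qqnorm(responses)
qqline(responses)
```

Even though `latent` is normal by construction, the rounded `responses` can only ever form a staircase in the plot.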

There is an argument against relying on p-values to test for normality, which I would consider if I were you. However, with a fairly small n of just over 70 and a significant departure from normality, you may have an issue. What test are you trying to run? Luckily, a lot of tests are robust to mild violations of normality.
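To see how much the discreteness alone can drive the Shapiro-Wilk result, one could compare the test on continuous scores and on the same scores rounded to a 4-point scale. This is a minimal sketch with simulated data; the parameters and names (`latent`, `responses`, `sw_latent`, `sw_responses`) are assumptions chosen for illustration, and `shapiro.test()` is the standard function from R's `stats` package.

```r
# Hypothetical data: simulated, not taken from the question.
set.seed(42)
latent <- rnorm(70, mean = 2.5, sd = 0.8)        # normal by construction
responses <- pmin(pmax(round(latent), 1), 4)     # same scores on a 1..4 scale

# Shapiro-Wilk test on the continuous and on the rounded version.
sw_latent <- shapiro.test(latent)
sw_responses <- shapiro.test(responses)

# Compare the W statistics and p-values side by side.
data.frame(
  data = c("latent (continuous)", "responses (1-4)"),
  W = c(sw_latent$statistic, sw_responses$statistic),
  p_value = c(sw_latent$p.value, sw_responses$p.value)
)
```

With this many ties the rounded version will typically be rejected, which says more about the coarse scale than about the shape of the underlying distribution.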