R Data Visualization – How to Use QQPlot to See Whether Data Are Normally Distributed

data visualization, histogram, normal distribution, qq-plot, r

I plotted this after running a Shapiro-Wilk normality test. The test showed that it is likely that the population is normally distributed. However, how can I see this "behaviour" on this plot?

[Image: normal Q-Q plot of the data]

UPDATE

A simple histogram of the data:

[Image: histogram of the data]

UPDATE

The Shapiro-Wilk test says:

[Image: Shapiro-Wilk test output]

Best Answer

"The test showed that it is likely that the population is normally distributed."

No; it didn't show that.

Hypothesis tests don't tell you how likely the null is. In fact you can bet this null is false.

The Q-Q plot doesn't give a strong indication of non-normality (the plot is fairly straight); there's perhaps a slightly shorter left tail than you'd expect but that really won't matter much.

The histogram as-is probably doesn't say a lot either; it does also hint at a slightly shorter left tail. But see here

The population distribution your data are from isn't going to be exactly normal. However, the Q-Q plot shows that normality is probably a reasonably good approximation.
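To make "fairly straight" concrete, here is a minimal sketch of how such a plot can be drawn and read in base R. The data are simulated (the vector `x` and its parameters are assumptions for illustration, not the asker's data), and the correlation of the Q-Q points is only a rough informal summary of straightness, not a formal test.

```r
# Simulated stand-in data; replace `x` with your own numeric vector.
set.seed(1)
x <- rnorm(200, mean = 50, sd = 10)

# Normal Q-Q plot: sorted sample values against theoretical normal quantiles.
# qqnorm() invisibly returns the plotted coordinates as a list with x and y.
qq <- qqnorm(x, main = "Normal Q-Q plot")

# Reference line through the first and third quartiles.
qqline(x, col = "red")

# Rough, informal summary of how straight the points are.
qq_cor <- cor(qq$x, qq$y)
print(qq_cor)
```

Points that hug the line suggest the normal approximation is reasonable. Points that bend away at one end suggest a tail that is shorter or longer than a normal tail on that side.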

If the sample size were not too small, a failure to reject in the Shapiro-Wilk test would probably be saying much the same thing.

Update: your edit to include the actual Shapiro-Wilk p-value is important, because it indicates that you would in fact reject the null at typical significance levels. The test indicates that your data are not drawn from a normal distribution, and the mild skewness indicated by the plots is probably what the test is picking up. For typical procedures that might assume normality of the variable itself (the one-sample t-test is one that comes to mind), at what appears to be a fairly large sample size, this mild non-normality will be of almost no consequence at all. One of the problems with goodness-of-fit tests is that they're more likely to reject just when it doesn't matter (when the sample size is large enough to detect some modest non-normality); similarly, they're more likely to fail to reject when it matters most (when the sample size is small).
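The sample-size point can be checked with a small simulation. This is only a sketch under stated assumptions: the gamma distribution with shape 20 (skewness of about 0.45) is my own stand-in for "mildly skewed" data, and `rejection_rate` is a hypothetical helper name, not something from the question.

```r
set.seed(42)

# Proportion of simulated samples of size n for which the Shapiro-Wilk
# test rejects normality at level alpha.
# Assumption: mildly skewed data are drawn from a gamma distribution
# with shape 20, whose skewness is 2 / sqrt(20), about 0.45.
rejection_rate <- function(n, reps = 500, alpha = 0.05) {
  p_values <- replicate(reps, shapiro.test(rgamma(n, shape = 20))$p.value)
  mean(p_values < alpha)
}

rate_small <- rejection_rate(20)    # small sample
rate_large <- rejection_rate(2000)  # large sample, same distribution
print(c(n20 = rate_small, n2000 = rate_large))
```

The underlying distribution is identical in both cases. Only the sample size changes, yet the large samples should be rejected far more often than the small ones, even though the non-normality is the same and matters less at large n.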
