MATLAB: Kstest – normal

Tags: confusion, kstest, MATLAB, normal, normality, Statistics and Machine Learning Toolbox

Hi, I am confused by the description of the 'kstest' function. Usually '1' means true and '0' means false, and the purpose of this function is to test whether or not a set of data is normally distributed. However, from what I gather from the description, '0' is returned when the data are normally distributed, and '1' is returned when they are not.
Is this the correct interpretation? The example is also a little confusing:
x = -2:1:4
x =
-2 -1 0 1 2 3 4
[h,p,k,c] = kstest(x,[],0.05,0)
h =
0
p =
0.13632
k =
0.41277
c =
0.48342
These data are linear, not normally distributed. Yet kstest returns '0', which means it classifies these data as normal. Is this a limitation of kstest with small data samples?
From what I read, the resolution is to use the 'smaller' or 'larger' tag to correct for this problem, but is there a clear cut-off for what counts as 'smaller' and what counts as 'larger'?
Lastly, if I were to use this test in a publication and report that our data were 'normal' (the function returned 0) or failed to be classified as 'normal' (the function returned 1), and I used the 'smaller' or 'larger' tags, how does that change the name of the test? It can't be the same test if it returns different values. How would I explain this?

Best Answer

Your example (taken from the documentation) "illustrates the difficulty of testing normality in small samples." If you plot
normplot(x)
you'll see that the deviations from a standard normal distribution occur in the two outer points. It doesn't take a lot more data to get a reasonable result, though:
x = -2:0.5:4;
[h,p,k,c] = kstest(x,[],0.05,0)
h =
1
p =
0.0245
k =
0.3947
c =
0.3614
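To connect these outputs to the original question, here is a minimal sketch of how h, p, k, and c relate. It assumes the name-value syntax of newer Statistics and Machine Learning Toolbox releases ('Alpha' in place of the positional arguments used above), and the names xs, F, and kManual are only for illustration. Note that h is the outcome of a hypothesis test rather than a true/false answer to "is it normal": h = 0 means the test failed to reject the null hypothesis that the data come from a standard normal distribution, and h = 1 means it rejected that hypothesis at the chosen significance level.

```matlab
% Sketch only: relate the outputs of kstest to each other.
x = -2:0.5:4;
alpha = 0.05;

% Name-value form (assumed available in your release); two-sided test
% against the standard normal distribution N(0,1).
[h,p,k,c] = kstest(x,'Alpha',alpha);

% The statistic k is the largest gap between the empirical CDF of the
% sample and the standard normal CDF, checked on both sides of each step.
xs = sort(x(:));
n  = numel(xs);
F  = normcdf(xs);                 % hypothesized CDF at the data points
kManual = max([(1:n)'/n - F; F - (0:n-1)'/n]);

fprintf('h = %d, p = %.4f, k = %.4f, c = %.4f\n', h, p, k, c);
fprintf('kManual = %.4f\n', kManual);
fprintf('p < alpha: %d, k > c: %d\n', p < alpha, k > c);
```

For the sample above, both comparisons agree with h = 1, since p = 0.0245 is below 0.05 and k = 0.3947 exceeds c = 0.3614. In the 7-point example, p = 0.13632 is above 0.05, so the test simply lacks the evidence to reject; that is not the same as confirming normality.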
Keep in mind, too, the documentation's comment about the Lilliefors test (lillietest): it is more likely to be the one you want.
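As a rough illustration of why: kstest compares the sample against a fully specified distribution (the standard normal by default), so data that are normal but have a different mean or variance will tend to be rejected. lillietest instead estimates the mean and variance from the sample and tests against the normal family as a whole. The sketch below uses simulated data; the mean of 10, the standard deviation of 3, the sample size, and the seed are all arbitrary assumptions for illustration.

```matlab
% Sketch only: kstest versus lillietest on normal data that is not N(0,1).
rng(0);                       % arbitrary seed, for reproducibility only
y = 10 + 3*randn(200,1);      % normal data with mean 10 and std 3

hKS = kstest(y);              % null: y comes from the standard normal N(0,1)
hL  = lillietest(y);          % null: y comes from some normal distribution

fprintf('kstest h = %d, lillietest h = %d\n', hKS, hL);
```

Here kstest should reject (h = 1) because the sample is far from N(0,1), even though it was drawn from a normal distribution. The lillietest result depends on the particular sample, so no specific value is claimed for it.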