Hypothesis Testing – Addressing Kolmogorov-Smirnov Test Failures for Normal Distribution Samples

hypothesis-testing, kolmogorov-smirnov-test, normality-assumption, r

I created a sample with 10000 normally distributed numbers. Subsequently, I used the Kolmogorov-Smirnov test to check if they are indeed normally distributed, and it turned out that they are not. How is this possible?

Below is my code.

data <- rnorm(n=10000, 5, 2)
ks.test(data, "pnorm")

And this is the output:

Exact one-sample Kolmogorov-Smirnov test

data: data
D = 1, p-value < 2.2e-16
alternative hypothesis: two-sided

Best Answer

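As highlighted in the comments (Alex J and COOLSerdash), there are two issues here. First, the distribution specified in the KS test differs from the one that generated the data. Calling ks.test(data, "pnorm") without further arguments tests against the standard normal (mean 0, sd 1), whereas the sample was drawn with mean 5 and sd 2. The correct approach is either to pass the true parameters to the test, or to draw the sample from a standard normal in the first place: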

> set.seed(30823)
> data <- rnorm(n=10000, 5, 2)
> ks.test(data, "pnorm", mean=5, sd=2)

    Asymptotic one-sample Kolmogorov-Smirnov test

data:  data
D = 0.0044899, p-value = 0.9877
alternative hypothesis: two-sided

> data1 <- rnorm(n=10000)
> ks.test(data1, "pnorm")

    Asymptotic one-sample Kolmogorov-Smirnov test

data:  data1
D = 0.01079, p-value = 0.1947
alternative hypothesis: two-sided
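
To see why the original call rejected so decisively, the statistic can be computed by hand. The following is a minimal sketch, assuming the usual definition of the one-sample KS statistic as the largest gap between the empirical CDF and the hypothesised CDF. The helper name ks_stat is hypothetical and not part of base R.

```r
# Hypothetical helper: one-sample KS statistic D = sup |F_n(x) - F0(x)|
ks_stat <- function(x, cdf, ...) {
  x <- sort(x)
  n <- length(x)
  f0 <- cdf(x, ...)  # hypothesised CDF evaluated at the sorted sample
  i <- seq_len(n)
  # The ECDF jumps at every sorted point, so compare F0 with the ECDF
  # value just after the jump (i / n) and just before it ((i - 1) / n).
  max(pmax(i / n - f0, f0 - (i - 1) / n))
}

set.seed(30823)
data <- rnorm(n = 10000, mean = 5, sd = 2)

d_wrong <- ks_stat(data, pnorm)                    # null: N(0, 1), the ks.test default
d_right <- ks_stat(data, pnorm, mean = 5, sd = 2)  # null: N(5, 2), the true model

d_wrong
d_right
```

With the default parameters the two CDFs barely overlap, so d_wrong is far larger than d_right, and with 10000 observations a gap of that size yields an essentially zero p-value.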

Second (a minor issue), even when the null hypothesis is true, a test used at level 0.05 still rejects it with probability (approximately) 5%, so an occasional rejection of genuinely normal data is expected.
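
This false-rejection rate can be checked by simulation. The sketch below repeatedly draws samples from the true model, runs the correctly specified test each time, and records how often the p-value falls below 0.05. The seed, the number of replications (n_sim) and the per-sample size of 1000 are arbitrary choices made for illustration.

```r
set.seed(1)    # arbitrary seed, only for reproducibility
n_sim <- 2000  # number of simulated data sets (assumed value)

# Each replication: draw from N(5, 2) and test against the same N(5, 2)
p_values <- replicate(
  n_sim,
  ks.test(rnorm(1000, mean = 5, sd = 2), "pnorm", mean = 5, sd = 2)$p.value
)

# Proportion of true nulls rejected at level 0.05
rejection_rate <- mean(p_values < 0.05)
rejection_rate
```

The resulting proportion should land near 0.05, up to simulation noise.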
