Solved – Kolmogorov-Smirnov two-sample $p$-values

kolmogorov-smirnov test

I am using the Kolmogorov–Smirnov two-sample test to compare distributions, and I noticed that a $p$-value is frequently reported alongside the test statistic. How is this $p$-value determined? I know it's the probability, under the null hypothesis, of obtaining a result at least as extreme as the one observed, but how is it computed given that this is a nonparametric test? That is, we can't assume Gaussian fluctuations in the distribution and compute the $p$-value as we would for a $t$-test.

Thanks!

Best Answer

Under the null hypothesis, the asymptotic distribution of the suitably scaled two-sample Kolmogorov–Smirnov statistic, $\sqrt{nm/(n+m)}\,D_{n,m}$ for sample sizes $n$ and $m$, is the Kolmogorov distribution, which has CDF

$$\operatorname{Pr}(K\leq x)=\frac{\sqrt{2\pi}}{x}\sum_{i=1}^\infty e^{-(2i-1)^2\pi^2/(8x^2)} \>.$$

The $p$-value is the upper tail probability, that is, one minus this CDF evaluated at the observed value of the statistic; see Section 4 and Section 2 of the Wikipedia page on the Kolmogorov–Smirnov test.
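
As a rough illustration of that calculation, here is a minimal Python sketch using only the standard library. The function names (`kolmogorov_cdf`, `ks_statistic`, `ks_asymptotic_pvalue`) are made up for this example, and the scaling by $\sqrt{nm/(n+m)}$ for sample sizes $n$ and $m$ is the usual two-sample normalization, stated here as an assumption since the answer does not spell it out. It evaluates the series for the CDF above and returns the asymptotic $p$-value, so it should only be trusted for reasonably large samples.

```python
import math


def kolmogorov_cdf(x, tol=1e-16):
    """Pr(K <= x) for the Kolmogorov distribution, via the series above."""
    if x <= 0:
        return 0.0
    c = math.pi ** 2 / (8.0 * x * x)
    total = 0.0
    i = 1
    while True:
        term = math.exp(-((2 * i - 1) ** 2) * c)
        total += term
        # Terms shrink monotonically, so stop once they are negligible.
        if term <= tol * total:
            break
        i += 1
    # Clamp to 1.0 to guard against rounding slightly above 1.
    return min(1.0, math.sqrt(2.0 * math.pi) / x * total)


def ks_statistic(a, b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(a[i], b[j])
        # Step both empirical CDFs past the value v (this handles ties).
        while i < n and a[i] <= v:
            i += 1
        while j < m and b[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_asymptotic_pvalue(a, b):
    """Return (D, p), with p taken from the asymptotic Kolmogorov distribution."""
    n, m = len(a), len(b)
    d = ks_statistic(a, b)
    scaled = math.sqrt(n * m / (n + m)) * d  # assumed two-sample scaling
    return d, 1.0 - kolmogorov_cdf(scaled)
```

For example, `ks_asymptotic_pvalue(sample_1, sample_2)` returns the statistic together with its approximate $p$-value. Because the $p$-value is formed as one minus the CDF in double precision, very small tail probabilities (below roughly $10^{-15}$) are not resolved by this sketch.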

You seem to be saying that a nonparametric test statistic shouldn't have a distribution. That's not the case. What makes this test nonparametric is that, under the null hypothesis, the distribution of the test statistic does not depend on which continuous probability distribution the original data come from. Note that the KS test has this property even for finite samples, as shown by @cardinal in the comments.
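
One way to see why the statistic is distribution-free: it depends only on the relative ordering of the pooled observations, so applying the same strictly increasing transformation to both samples leaves it unchanged. The sketch below is only an illustration under assumptions of my own choosing (the helper name `ks_statistic`, the seed, the sample sizes 30 and 40, and the uniform-to-exponential map). It draws two uniform samples, maps them to exponential draws through the inverse CDF, and computes the statistic both ways.

```python
import math
import random


def ks_statistic(a, b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(a[i], b[j])
        while i < n and a[i] <= v:
            i += 1
        while j < m and b[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


rng = random.Random(0)  # arbitrary seed, for reproducibility only
x = [rng.random() for _ in range(30)]
y = [rng.random() for _ in range(40)]


def to_exponential(u):
    """Strictly increasing map from Uniform(0, 1) to Exponential(1) draws."""
    return -math.log(1.0 - u)


d_uniform = ks_statistic(x, y)
d_exponential = ks_statistic([to_exponential(u) for u in x],
                             [to_exponential(u) for u in y])

# The ordering of the pooled sample is preserved, so the statistic is too.
print(d_uniform, d_exponential)
```

Since the two printed values agree, the null distribution of the statistic computed from exponential data coincides with the one computed from uniform data, and the same argument applies to any other continuous distribution.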