Kolmogorov-Smirnov Test – Understanding p-Value and KS-Test Statistic Decrease as Sample Size Increases

goodness of fit, intuition, p-value, python, scipy

Why do p-values and KS test statistics decrease as the sample size increases? Take this Python code as an example:

import numpy as np
from scipy.stats import norm, ks_2samp

np.random.seed(0)
for n in [10, 100, 1000, 10000, 100000, 1000000]:
    # Draw two samples from normals that differ only slightly in scale.
    x = norm(0, 4).rvs(n)
    y = norm(0, 4.1).rvs(n)
    print(ks_2samp(x, y))

The results are:

Ks_2sampResult(statistic=0.30000000000000004, pvalue=0.67507815371659508)
Ks_2sampResult(statistic=0.080000000000000071, pvalue=0.89375155241057247)
Ks_2sampResult(statistic=0.03499999999999992, pvalue=0.5654378910227662)
Ks_2sampResult(statistic=0.026599999999999957, pvalue=0.0016502962880920896)
Ks_2sampResult(statistic=0.0081200000000000161, pvalue=0.0027192461984023855)
Ks_2sampResult(statistic=0.0065240000000000853, pvalue=6.4573678008760032e-19)

Intuitively, I understand that as n grows the test becomes "more sure" that the two distributions are different. But if the sample size is very large, what is the point of similarity tests such as this one, the Anderson-Darling test, or the t-test? With very large n, the distributions will always be found to be "significantly" different. Now I'm wondering what on earth the point of p-values is. The result depends so much on the sample size: if p > 0.05 and you want it lower, just get more data; and if p < 0.05 and you want it higher, just remove some data.
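For example, here is the "remove some data" trick as a quick sketch (fresh draws from the same two normals as above; the exact numbers will vary with the seed): truncating an n = 10**6 sample to its first 100 points typically pushes the p-value back above 0.05, simply because the test loses power.

import numpy as np
from scipy.stats import norm, ks_2samp

np.random.seed(0)
x = norm(0, 4).rvs(1000000)
y = norm(0, 4.1).rvs(1000000)
print(ks_2samp(x, y))              # very large n: tiny p-value
# "Remove some data": keep only the first 100 points of each sample.
# The populations are unchanged, but the test loses the power to
# distinguish them, so the p-value is typically large again.
print(ks_2samp(x[:100], y[:100]))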

Also, if the two distributions were identical, the KS statistic would be 0 and the p-value 1. But in my example, as n increases the KS statistic decreases, suggesting the distributions become more and more similar, while the p-value also decreases, suggesting they become more and more different.
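For contrast, a quick sketch of what happens when the two distributions really are identical (both samples drawn from N(0, 4)): the statistic still shrinks toward 0 as n grows, but the p-value does not drift toward 0; under a true null it just bounces around.

import numpy as np
from scipy.stats import norm, ks_2samp

np.random.seed(0)
for n in [100, 10000, 1000000]:
    x = norm(0, 4).rvs(n)
    y = norm(0, 4).rvs(n)  # same distribution this time
    # statistic -> 0 as n grows; p-value fluctuates rather than shrinking
    print(n, ks_2samp(x, y))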

Best Answer

The test statistic decreases because your two distributions are very similar and larger samples have less noise. If you compared the two theoretical distributions you used, you would get the "true" KS statistic: the supremum of |F1(x) - F2(x)| over x, which for N(0, 4) versus N(0, 4.1) is small but nonzero. As you add more data, your estimated KS statistic approaches this true value. At the same time, your confidence that they are in fact two different distributions increases (i.e. the p-value decreases), because you have greater confidence in your estimates of the individual distributions.
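You can approximate that true value directly from the two theoretical CDFs; here is a minimal grid-based sketch (the grid bounds and resolution are arbitrary choices):

import numpy as np
from scipy.stats import norm

# The population-level KS statistic is sup over x of |F1(x) - F2(x)|;
# a dense grid gives a simple approximation of that supremum.
xs = np.linspace(-20, 20, 200001)
gap = np.abs(norm(0, 4).cdf(xs) - norm(0, 4.1).cdf(xs))
print(gap.max())           # about 0.006 for N(0, 4) vs N(0, 4.1)
print(xs[np.argmax(gap)])  # the supremum is attained near x = +/- 4

The empirical statistic at n = 10**6 above (about 0.0065) is already close to this limit, while the p-value keeps falling because the evidence against "the two distributions are identical" keeps accumulating.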
