Solved – Can you use the Kolmogorov-Smirnov test to directly test for equivalence of two distributions

Tags: distributions, equivalence, kolmogorov-smirnov test, tost

There has been talk on other questions of how one might use the Two One-Sided Tests (TOST) approach for the Kolmogorov-Smirnov (KS) test, but I was wondering whether it was possible to directly use the test statistic to show that two distributions were similar?

As far as I understand it, the KS test statistic is the largest vertical distance between two CDFs, and the one-sample version was originally used as a goodness-of-fit test. In [1] this is illustrated as rejecting when the empirical distribution function leaves a confidence band around the hypothesized distribution (i.e. at least one point is too far from the distribution being tested against).
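To make that definition concrete, here is a minimal sketch of the two-sample statistic using only the Python standard library. The function name `ks_two_sample_statistic` and the toy samples `a` and `b` are made up for illustration; they do not come from [1].

```python
from bisect import bisect_right


def ks_two_sample_statistic(x, y):
    """Largest absolute gap between the empirical CDFs of x and y."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    # Both ECDFs are right-continuous step functions that only jump at
    # observed values, so it is enough to compare them at every data point.
    for v in xs + ys:
        f_x = bisect_right(xs, v) / n  # fraction of x-values <= v
        f_y = bisect_right(ys, v) / m  # fraction of y-values <= v
        d = max(d, abs(f_x - f_y))
    return d


# Hypothetical toy samples, chosen only to keep the arithmetic easy to follow.
a = [0.1, 0.4, 0.7, 1.2]
b = [0.3, 0.9, 1.5, 2.0]
print(ks_two_sample_statistic(a, b))
```

For these toy samples the largest gap is 0.5: at 0.7, three of the four `a` values but only one of the four `b` values have been passed.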

If the two-sample version is often used to show that two distributions are significantly different from one another, in a similar way to the one-sample version, can we invert the calculation of the confidence intervals from using $\alpha = 0.05$ (i.e. $1-\alpha = 0.95$) to instead use $\alpha = 0.95$ (i.e. $1-\alpha = 0.05$), as a way of showing that the maximum difference between the two distributions is small enough for them to be called significantly similar?

[1] Massey, F. "The Kolmogorov-Smirnov test for goodness-of-fit", Journal of the American Statistical Association, vol. 46, no. 253, pp. 68-78, Mar 1951

Best Answer

When conducting the Kolmogorov-Smirnov test, we assume $H_0:$ the two distributions are equivalent. We then calculate a test statistic and, if the corresponding $p$-value is small enough, we reject $H_0$ and conclude $H_A:$ the two distributions are different.
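Written out, and under notation that I am introducing here for illustration ($F_n$ and $G_m$ for the two empirical CDFs, $n$ and $m$ for the sample sizes), the two-sample setup looks roughly like this. The critical value is the usual large-sample approximation, so treat it as a sketch rather than an exact small-sample rule:

$$
H_0: F = G \quad \text{vs.} \quad H_A: F \neq G, \qquad D_{n,m} = \sup_x \left| F_n(x) - G_m(x) \right|
$$

$$
\text{reject } H_0 \text{ at level } \alpha \text{ if } \quad D_{n,m} > c(\alpha)\sqrt{\frac{n+m}{nm}}, \qquad c(\alpha) = \sqrt{-\tfrac{1}{2}\ln\!\left(\tfrac{\alpha}{2}\right)}
$$

For $\alpha = 0.05$ this gives $c(\alpha) \approx 1.36$. Note that the threshold shrinks as $n$ and $m$ grow, so the same observed distance can be non-significant in small samples and highly significant in large ones.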

As far as hypothesis tests go, we use a $p$-value to quantify the amount of evidence we have to reject the null hypothesis. A $p$-value of 1 indicates that we have gathered no evidence to reject the null hypothesis. A $p$-value close to 0 indicates there is overwhelming evidence to reject the null hypothesis.

Suppose we run the K-S test on our data and obtain $p=0.99.$ This indicates there is very little evidence to reject the null hypothesis. However, failing to reject is not the same as showing the null is true (the test may simply lack the power to detect a real difference), so we cannot establish a standard of $\alpha=0.95$ such that $p>\alpha$ implies that we conclude the null hypothesis is correct. Further, I don't believe there is an alternate test that would allow us to conclude that the two distributions are exactly the same.

What I believe you can do is be entirely honest in the write-up or discussion. Mention that you ran a K-S test, report the $p$-value, and, if the $p$-value is sufficiently high, state that there is very little evidence to suggest that the two distributions are different. So, while you cannot conclude that the distributions are identical, you can note that you found no evidence that they differ. The larger your sample size $n$, the more faith you can have in this statement, because a larger sample gives the test more power to detect a difference if one exists.
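To illustrate why sample size matters so much here, the sketch below holds a hypothetical observed distance fixed at $D = 0.2$ and evaluates an approximate $p$-value for several sample sizes. It uses the standard large-sample (Kolmogorov series) approximation, which is rough for small samples; the function name `ks_asymptotic_pvalue`, the cutoff of 0.2, and the chosen values of `d` and `n` are all assumptions made for this example.

```python
import math


def ks_asymptotic_pvalue(d, n, m, terms=100):
    """Large-sample approximation to the two-sample K-S p-value.

    Uses Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)
    with lam = sqrt(n*m/(n+m)) * d. Only an approximation for small n, m.
    """
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series converges slowly near zero, where Q is essentially 1.
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    # Clamp to [0, 1] to guard against rounding error.
    return min(1.0, max(0.0, 2.0 * total))


d = 0.2  # hypothetical observed K-S distance, held fixed across sample sizes
for n in (10, 100, 1000):
    print(n, ks_asymptotic_pvalue(d, n, n))
```

With 10 observations per group the approximate $p$-value is roughly 0.99, with 100 per group it is roughly 0.04, and with 1000 per group it is essentially zero, even though the observed distance is identical in all three cases. A high $p$-value from a small sample therefore says little about how similar the distributions are.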

It's not quite the answer that you were probably looking for, but it's not a total wash, either. Hope this helps!