Distributions – Kolmogorov-Smirnov Test (ks_2samp) in SciPy

Tags: distributions, python, scipy

Newbie Kolmogorov-Smirnov question. I have two sample data sets. When I apply ks_2samp from scipy to calculate the p-value, it's really small: Ks_2sampResult(statistic=0.226, pvalue=8.66144540069212e-23)

When I compare their histograms, they look like they are coming from the same distribution. Am I interpreting the test incorrectly?

The scipy docs say: "If the KS statistic is small or the p-value is high, then we cannot reject the hypothesis that the distributions of the two samples are the same." So with the p-value being so low, we can reject the null hypothesis that the distributions are the same, right?

On a side note, are there other measures that show whether two distributions are similar? Histogram overlap? KDE overlap?
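One simple similarity measure along those lines is the histogram overlap coefficient: the shared area under two normalized histograms built on common bin edges. Here is a minimal sketch (my own illustration, not from the answer below) using numpy, with the bin count of 30 chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(801)
a = rng.beta(11, 9, size=1000)
b = rng.beta(12, 8, size=1000)

# Shared bin edges so the two histograms are directly comparable
edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=30)
pa, _ = np.histogram(a, bins=edges, density=True)
pb, _ = np.histogram(b, bins=edges, density=True)
widths = np.diff(edges)

# Overlap coefficient: area under the pointwise minimum of the two
# normalized histograms (1.0 = identical histograms, 0.0 = disjoint)
overlap = np.sum(np.minimum(pa, pb) * widths)
print(f"histogram overlap: {overlap:.3f}")
```

Note that a high overlap does not mean the distributions are the same; as the answer below shows, the K-S test can detect differences that histograms visually hide.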

thanks,

[Histograms of the two sample data sets]

Best Answer

I can't retrieve your data from your histograms. So let's look at largish datasets from a couple of slightly different distributions and see if the K-S two-sample test can discern that the two samples aren't from the same distribution. [I'm using R.]

set.seed(801)
x1 = rbeta(1000, 11, 9)
x2 = rbeta(1000, 12, 8)

summary(x1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.2311  0.4852  0.5532  0.5517  0.6234  0.8387 
summary(x2)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.2516  0.5278  0.6043  0.5994  0.6736  0.8953 

Here are histograms of the two samples, each with the density function of its population shown for reference.

[Histograms of x1 and x2 with their Beta population densities overlaid]

Somewhat similar, but not exactly the same. For example, $\mu_1 = 11/20 = 0.55$ and $\mu_2 = 12/20 = 0.60.$ Furthermore, the K-S test rejects the null hypothesis that the two samples came from the same distribution. K-S tests aren't exactly famous for their good power, but with $n = 1000$ observations in each sample, the test was able to reject with P-value very near $0.$

ks.test(x1, x2)

        Two-sample Kolmogorov-Smirnov test

data:  x1 and x2
D = 0.205, p-value < 2.2e-16
alternative hypothesis: two-sided
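The same experiment can be run in the asker's setting with scipy. A sketch mirroring the R example above (numpy's Beta generator takes the same shape parameters as rbeta, though a different seed means the draws won't match exactly):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(801)
x1 = rng.beta(11, 9, size=1000)   # mirrors rbeta(1000, 11, 9)
x2 = rng.beta(12, 8, size=1000)   # mirrors rbeta(1000, 12, 8)

# Two-sample K-S test, as in the question
res = stats.ks_2samp(x1, x2)
print(res.statistic, res.pvalue)  # D around 0.2, p-value essentially 0
```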

The test statistic $D$ of the K-S test is the maximum vertical distance between the empirical CDFs (ECDFs) of the samples.
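That definition of $D$ can be verified directly: evaluate both ECDFs at every pooled data point and take the largest gap. A sketch (my own check, not part of the original answer):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(801)
x1 = rng.beta(11, 9, size=1000)
x2 = rng.beta(12, 8, size=1000)

# ECDF of a sample evaluated at points t: fraction of observations <= t.
# For step functions, the supremum of |F1 - F2| is attained at a data point,
# so evaluating at the pooled sample suffices.
pooled = np.sort(np.concatenate([x1, x2]))
F1 = np.searchsorted(np.sort(x1), pooled, side="right") / len(x1)
F2 = np.searchsorted(np.sort(x2), pooled, side="right") / len(x2)
D = np.abs(F1 - F2).max()

# Agrees with scipy's statistic
assert np.isclose(D, stats.ks_2samp(x1, x2).statistic)
print(f"D = {D:.3f}")
```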

plot(ecdf(x1), xlim=0:1, col="blue")
 lines(ecdf(x2), col="brown")

[ECDF plot: x1 in blue, x2 in brown]

As seen in the ECDF plots, x2 (brown) stochastically dominates x1 (blue) because the former plot lies consistently to the right of the latter.

Because the shapes of the two distributions aren't exactly the same, some might say a two-sample Wilcoxon test is not entirely appropriate. I would not claim the Wilcoxon test shows the median of x2 to be larger than the median of x1, but it does find a difference between the two samples.

wilcox.test(x1, x2)

         Wilcoxon rank sum test with continuity correction

data:  x1 and x2
W = 372900, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0
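R's two-sample wilcox.test is the Wilcoxon rank-sum (Mann-Whitney U) test, which scipy exposes as mannwhitneyu. A sketch of the equivalent call (same caveat as above: different RNG, so the exact statistic won't match R's):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(801)
x1 = rng.beta(11, 9, size=1000)
x2 = rng.beta(12, 8, size=1000)

# Two-sided rank-sum test, matching R's wilcox.test(x1, x2) default
u = stats.mannwhitneyu(x1, x2, alternative="two-sided")
print(u.statistic, u.pvalue)  # tiny p-value, as in the R output
```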