I'm trying to use the KS test to determine whether one group of data stochastically dominates another. I'm studying a dataset on the performance of companies, which are divided into 2 groups. Instead of comparing mean values for these two groups, I follow [1] and want to compare the distributions using the KS test (Table 3). They do two tests: one-sided (A less than B) and two-sided (equality). For that I use Stata's ksmirnov command; the problem is how to interpret the output. It returns D and p, but what one can conclude from these values is not clear to me. For instance, for my groups it returns:
    . ksmirnov performance, by(myGroup)

    Two-sample Kolmogorov-Smirnov test for equality of distribution functions

    Smaller group       D       P-value  Corrected
    ----------------------------------------------
    0:                  0.0047    0.972
    1:                 -0.1635    0.000
    Combined K-S:       0.1635    0.000      0.000
The row labeled 0 checks the hypothesis that group0 has smaller values than group1. The row labeled 1 checks the hypothesis that group0 has larger values than group1. But I do not understand how to interpret D and p. What is the unit of D, and is it big enough to accept the hypothesis (for instance, at the 0.05 significance level)?
Best Answer
The `D`s are the test statistics; they derive from the differences between the empirical cumulative distribution functions of the two groups. Therefore, they are differences of probabilities (unitless, between -1 and 1). The p-values have their usual interpretation: if $pval \leq \alpha$, reject the null hypothesis, where $\alpha$ is a predetermined significance level. Stata also gives an additional p-value for the non-directional hypothesis (Combined K-S), corrected for small samples.
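To make the "differences of probabilities" point concrete, the three statistics can be written as follows. This is a sketch of the definitions as I understand them from the sign pattern in the output above; the symbols $F_0$ and $F_1$ (the empirical CDFs of group 0 and group 1) are my notation, so check it against the manual before relying on it.

$$D^{+} = \max_x \{F_0(x) - F_1(x)\}, \qquad D^{-} = \min_x \{F_0(x) - F_1(x)\}, \qquad D = \max\left(|D^{+}|, |D^{-}|\right)$$

Read this way, the reported value of -0.1635 says that at some value of performance the empirical CDF of group 0 lies about 16 percentage points below that of group 1.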
Examples and details of what Stata does are in [R] ksmirnov, including the math in the Methods and formulas section.
A "manual" computation of the `D`s follows the approach in [1], which Stata cites in its manual.
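The manual computation can be sketched in a few lines of Python (not Stata, and not the code from the manual). The function names `ks_two_sample` and `one_sided_p` and the toy data are hypothetical, and the one-sided p-value uses the large-sample approximation $\exp(-2z^2)$ with $z = D\sqrt{n_0 n_1/(n_0+n_1)}$, which I am assuming is what is reported for the directional rows; the small-sample correction is not reproduced here.

```python
from bisect import bisect_right
from math import exp


def ks_two_sample(x0, x1):
    """Return (d_plus, d_minus, d_combined) from the two empirical CDFs."""
    s0, s1 = sorted(x0), sorted(x1)
    n0, n1 = len(s0), len(s1)
    # F0(x) - F1(x), evaluated at every observed value in either group.
    diffs = [
        bisect_right(s0, x) / n0 - bisect_right(s1, x) / n1
        for x in s0 + s1
    ]
    d_plus = max(diffs)    # largest amount by which F0 exceeds F1
    d_minus = min(diffs)   # largest amount by which F0 falls below F1 (<= 0)
    d_combined = max(abs(d_plus), abs(d_minus))
    return d_plus, d_minus, d_combined


def one_sided_p(d, n0, n1):
    """Assumed large-sample one-sided p-value: exp(-2 * n0*n1/(n0+n1) * d^2)."""
    return exp(-2.0 * n0 * n1 / (n0 + n1) * d * d)


# Toy data (made up for illustration): group 0 tends to have smaller values.
group0 = [1.0, 2.0, 3.0, 4.0]
group1 = [3.5, 4.5, 5.5, 6.5]

d_plus, d_minus, d_comb = ks_two_sample(group0, group1)
p_plus = one_sided_p(d_plus, len(group0), len(group1))
p_minus = one_sided_p(d_minus, len(group0), len(group1))
print(d_plus, d_minus, d_comb, p_plus, p_minus)
```

In the toy data the empirical CDF of group 0 is never below that of group 1, so `d_minus` is zero and only the "group 0 has smaller values" direction shows any difference.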
[1] Riffenburgh, R. H. 2005. Statistics in Medicine. 2nd ed. New York: Elsevier.