Solved – simple equivalence test version of the Kolmogorov–Smirnov test

equivalence, kolmogorov-smirnov test, tost

Has two one-sided tests for equivalence (TOST) been framed for the Kolmogorov–Smirnov test to test the negativist null hypothesis that two distributions differ by at least some researcher-specified level?

If not TOST, then some other form of equivalence test?

Nick Stauner wisely points out (I should already know 😉) that there are other nonparametric TOST equivalence tests for null hypotheses of stochastic equivalence and, with more restrictive assumptions, of median equivalence.

Best Answer

Ok, here's my first attempt. Close scrutiny and comments appreciated!

The Two-Sample Hypotheses
If we can frame two-sample one-sided Kolmogorov-Smirnov hypothesis tests, with null and alternate hypotheses along these lines:

$\text{H}_{0}\text{: }F_{Y}\left(t\right) \geq F_{X}\left(t\right)$, and

$\text{H}_{\text{A}}\text{: }F_{Y}\left(t\right) < F_{X}\left(t\right)$, for at least one $t$, where:

  • the test statistic $D^{-}=\left|\min_{t}\left(F_{Y}\left(t\right) - F_{X}\left(t\right)\right)\right|$ corresponds to $\text{H}_0\text{: }F_{Y}\left(t\right) \geq F_{X}\left(t\right)$;

  • the test statistic $D^{+}=\left|\max_{t}\left(F_{Y}\left(t\right) - F_{X}\left(t\right)\right)\right|$ corresponds to $\text{H}_0\text{: }F_{Y}\left(t\right) \leq F_{X}\left(t\right)$; and

  • $F_{Y}\left(t\right)$ & $F_{X}\left(t\right)$ are the empirical CDFs of samples $Y$ and $X$,

then it should be reasonable to create a general interval hypothesis for an equivalence test along these lines (assuming that the equivalence interval is symmetric for the moment):

$\text{H}^{-}_0\text{: }\left|F_{Y}\left(t\right) - F_{X}\left(t\right)\right| \geq \Delta$, for at least one $t$, and

$\text{H}^{-}_{\text{A}}\text{: }\left|F_{Y}\left(t\right) - F_{X}\left(t\right)\right| < \Delta$, for all $t$.

This would translate to the specific two one-sided "negativist" null hypotheses to test for equivalence (the two hypotheses take the same form, since both $D^{+}$ and $D^{-}$ are non-negative):

$\text{H}^{-}_{01}\text{: }D^{+} \geq \Delta$, or

$\text{H}^{-}_{02}\text{: }D^{-} \geq \Delta$.

Rejecting both $\text{H}^{-}_{01}$ and $\text{H}^{-}_{02}$ would lead one to conclude that $-\Delta < F_{Y}\left(t\right) - F_{X}\left(t\right) < \Delta$. Of course, the equivalence interval need not be symmetric, and $-\Delta$ and $\Delta$ could be replaced with $\Delta_{2}$ (lower) and $\Delta_{1}$ (upper) for the respective one-sided null hypotheses.
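A minimal sketch of how $D^{+}$ and $D^{-}$ as defined above might be computed from two samples, using only the Python standard library. The function name `ks_one_sided` is hypothetical and chosen for illustration. The ECDF difference is evaluated only at the pooled observations, since the difference of two step functions can only change there.

```python
from bisect import bisect_right

def ks_one_sided(y, x):
    """Return (D_plus, D_minus) for the empirical CDFs of samples y and x."""
    ys, xs = sorted(y), sorted(x)
    diffs = []
    for t in ys + xs:
        # ECDF value at t = fraction of observations <= t
        f_y = bisect_right(ys, t) / len(ys)
        f_x = bisect_right(xs, t) / len(xs)
        diffs.append(f_y - f_x)
    # D+ = |max_t (F_Y - F_X)|, D- = |min_t (F_Y - F_X)|
    return abs(max(diffs)), abs(min(diffs))

y = [1, 2, 3, 4]
x = [3, 4, 5, 6]
d_plus, d_minus = ks_one_sided(y, x)
print(d_plus, d_minus)  # 0.5 0.0
```

Swapping the arguments swaps the roles of $D^{+}$ and $D^{-}$, which is why both one-sided statistics are needed for the interval hypothesis.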

The Test Statistics (Updated: Delta is outside the absolute value sign)
The test statistics $D^{+}_{1}$ and $D^{-}_{2}$ (leaving the $n_{Y}$ and $n_{X}$ implicit) correspond to $\text{H}^{-}_{01}$ and $\text{H}^{-}_{02}$, respectively, and are:

$D^{+}_{1} = \Delta - D^{+} = \Delta - \left|\max_{t}\left[\left(F_{Y}\left(t\right) - F_{X}\left(t\right)\right)\right]\right|$, and

$D^{-}_{2} = \Delta - D^{-} = \Delta - \left|\min_{t}\left[\left(F_{Y}\left(t\right) - F_{X}\left(t\right)\right)\right]\right|$
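As a small illustration of the two statistics above, here is a sketch for the symmetric case only. The function name `equivalence_statistics` is hypothetical, and the inputs are assumed to be already-computed values of $D^{+}$ and $D^{-}$.

```python
def equivalence_statistics(d_plus, d_minus, delta):
    """Return (D1_plus, D2_minus) = (delta - D_plus, delta - D_minus)."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return delta - d_plus, delta - d_minus

d1_plus, d2_minus = equivalence_statistics(d_plus=0.5, d_minus=0.0, delta=0.25)
print(d1_plus, d2_minus)  # -0.25 0.25
```

A negative value simply means the observed one-sided statistic exceeds $\Delta$ in that direction.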

The Equivalence/Relevance Threshold
The interval $[-\Delta, \Delta]$ (or $[\Delta_{2}, \Delta_{1}]$, if using an asymmetric equivalence interval) is expressed in units of $D^{+}$ and $D^{-}$, that is, the magnitude of differenced probabilities. As $n_{Y}$ and $n_{X}$ approach infinity, the CDF of sample size-scaled $D^{+}$ (or $D^{-}$) is $0$ for $t \le 0$, and must be $>0$ for $t > 0$:

$$\lim_{n_{Y},n_{X}\to \infty}p^{+} = \text{P}\left(\sqrt{\frac{n_{Y}n_{X}}{n_{Y}+n_{X}}}D^{+} \le t\right) = 1 - e^{-2t^{2}}$$

Figure: CDF of sample size-scaled $D^{+}$ (or $D^{-}$)
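The limiting CDF above is easy to evaluate directly. The following is a rough sketch, assuming the large-sample approximation is adequate for the sample sizes at hand; the function names are hypothetical.

```python
import math

def scaled_statistic(d, n_y, n_x):
    """Scale D+ (or D-) by sqrt(n_Y * n_X / (n_Y + n_X))."""
    return math.sqrt(n_y * n_x / (n_y + n_x)) * d

def asymptotic_cdf(t):
    """Limiting CDF of the scaled one-sided statistic: 1 - exp(-2 t^2)."""
    return 0.0 if t <= 0 else 1.0 - math.exp(-2.0 * t * t)

def asymptotic_upper_tail(d, n_y, n_x):
    """P(scaled statistic > observed value) under the limiting CDF."""
    return 1.0 - asymptotic_cdf(scaled_statistic(d, n_y, n_x))

print(scaled_statistic(0.5, 50, 50))  # 2.5
print(asymptotic_upper_tail(0.5, 50, 50))
```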

So it seems to me that the PDF for sample size-scaled $D^{+}$ (or sample size-scaled $D^{-}$) must be $0$ for $t \le 0$, and must be $>0$ for $t > 0$:

$$f(t) = \frac{d}{dt}\left(1 - e^{-2t^{2}}\right) = 4te^{-2t^{2}}$$

Figure: PDF of sample size-scaled $D^{+}$ (or $D^{-}$)
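As a sanity check on the density, one can compare $4te^{-2t^{2}}$ against a numerical derivative of the limiting CDF $1 - e^{-2t^{2}}$. This is only a sketch; the step size `h` and the test points are arbitrary choices.

```python
import math

def asymptotic_cdf(t):
    """Limiting CDF of the scaled statistic: 1 - exp(-2 t^2) for t > 0."""
    return 0.0 if t <= 0 else 1.0 - math.exp(-2.0 * t * t)

def asymptotic_pdf(t):
    """Derivative of the limiting CDF: 4 t exp(-2 t^2) for t > 0."""
    return 0.0 if t <= 0 else 4.0 * t * math.exp(-2.0 * t * t)

h = 1e-6  # step for the central difference (arbitrary small value)
for t in (0.25, 0.5, 1.0, 1.5):
    numeric = (asymptotic_cdf(t + h) - asymptotic_cdf(t - h)) / (2 * h)
    print(t, numeric, asymptotic_pdf(t))
```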

Glen_b points out that this is a Rayleigh distribution with $\sigma=\frac{1}{2}$. So the large sample quantile function for sample size-scaled $D^{+}$ and $D^{-}$ is:

$$\text{CDF}^{-1} = Q\left(p\right) = \sqrt{\frac{-\ln{\left(1 - p\right)}}{2}}$$

and a liberal choice of $\Delta$ might be the critical value $Q_{\alpha}+\sigma/2 = Q_{\alpha}+\frac{1}{4}$, and a more strict choice the critical value $Q_{\alpha}+\sigma/4=Q_{\alpha}+\frac{1}{8}$.
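The quantile function and the two suggested thresholds can be computed as below. This sketch assumes that $Q_{\alpha}$ means the upper critical value $Q(1-\alpha)$, which the text does not state explicitly, and the resulting values are on the scale of the sample size-scaled statistic. The function names are hypothetical.

```python
import math

SIGMA = 0.5  # Rayleigh scale parameter noted in the text

def asymptotic_quantile(p):
    """Q(p) = sqrt(-ln(1 - p) / 2), the inverse of 1 - exp(-2 t^2)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return math.sqrt(-math.log(1.0 - p) / 2.0)

def delta_choices(alpha=0.05):
    """Return (liberal, strict) thresholds: Q_alpha + sigma/2, Q_alpha + sigma/4."""
    # Assumption: Q_alpha is taken as the upper critical value Q(1 - alpha).
    q_alpha = asymptotic_quantile(1.0 - alpha)
    return q_alpha + SIGMA / 2, q_alpha + SIGMA / 4

liberal, strict = delta_choices(0.05)
print(round(liberal, 4), round(strict, 4))  # 1.4739 1.3489
```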