Solved – How to use the KS test for 2 vectors of scores in Python

hypothesis testing, kolmogorov-smirnov test, MATLAB, p-value, python

I am trying to figure out how to decide whether or not to reject the null hypothesis using the KS test.

  1. In MATLAB there is a function named kstest2 that decides whether to reject $H_0$. I want to use the Python SciPy function
    ks_2samp, but it returns the p-value and the $\alpha$. How can I
    determine whether one should reject $H_0$ given those 2 parameters?
  2. If I have 2 vectors of scores per day, how can I use kstest2 (MATLAB) or ks_2samp (Python) to see whether they are distributed the same?
  3. Is there a clean way to change the scores vector to a continuously distributed vector?

Best Answer

  1. Simply compare the p-value to your desired significance level. If your p-value is less than (or equal to) your significance level (your chosen type I error rate, $\alpha$), you should reject the null hypothesis. (You may need to brush up on how hypothesis testing works.)

  2. If you mean you want to combine information across many days, it depends on whether the days share a distribution (within each of the two groups being compared in the test) or not. One approach that works in either case is to test the daily p-values for uniformity against the alternative that they tend to be smaller. That gives an overall test that applies over many days. However, if you're testing every day, you may want to consider the properties of such a procedure (for example, how repeated testing affects the overall type I error rate).

  3. No. If you don't have continuous distributions you probably shouldn't be doing a KS test at all; it won't have the usual properties (e.g. type I error rates will be too low, power will be low).
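
As a rough illustration of points 1 and 2, here is a minimal Python sketch. It makes several assumptions that are not part of the question: the data are synthetic and continuous, the days are independent, and the names `ALPHA`, `reject_h0` and `combine_days` are made up for this example. SciPy's `ks_2samp` returns the test statistic and the p-value; the significance level $\alpha$ is something you choose yourself. Fisher's method (via `scipy.stats.combine_pvalues`) is used here as one possible way to test whether the daily p-values tend to be smaller than uniform; other combination methods would also be reasonable.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # significance level chosen by the analyst (an assumption here)


def reject_h0(sample_a, sample_b, alpha=ALPHA):
    """Run a two-sample KS test and apply the decision rule p <= alpha.

    Returns (reject, statistic, p_value).
    """
    result = stats.ks_2samp(sample_a, sample_b)
    reject = bool(result.pvalue <= alpha)
    return reject, float(result.statistic), float(result.pvalue)


def combine_days(p_values):
    """Combine independent daily p-values with Fisher's method.

    A small combined p-value suggests the daily p-values are, on the
    whole, smaller than a uniform distribution would produce.
    """
    _, p_combined = stats.combine_pvalues(p_values, method="fisher")
    return float(p_combined)


# Synthetic example: 5 days, two groups whose means differ by 0.5.
rng = np.random.default_rng(0)
days_a = [rng.normal(0.0, 1.0, size=200) for _ in range(5)]
days_b = [rng.normal(0.5, 1.0, size=200) for _ in range(5)]

daily = [reject_h0(a, b) for a, b in zip(days_a, days_b)]
for day, (reject, stat, p) in enumerate(daily, start=1):
    print(f"day {day}: D={stat:.3f}, p={p:.3g}, reject H0: {reject}")

overall_p = combine_days([p for _, _, p in daily])
print(f"combined p-value: {overall_p:.3g}, reject H0 overall: {overall_p <= ALPHA}")
```

MATLAB's kstest2 applies the same comparison internally (its default significance level is 5%), which is why it can report a reject/do-not-reject decision directly. The caveat in point 3 still applies: with discrete scores the p-values from this sketch would not have their usual properties.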