Solved – Alternative to Kolmogorov-Smirnov test when parameters are estimated from the data

discrete data, hypothesis testing, kolmogorov-smirnov test, nonparametric

I need to compare whether two distributions are similar when the values in each are scaled by that distribution's mean. One limitation of the KS test, noted at http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm, is that "if location, scale, and shape parameters are estimated from the data, the critical region of the K-S test is no longer valid."

Consider for example:

Data1 consists of 10000 numbers drawn from a uniform distribution on [0,1], with sample mean 0.5.

Data2 consists of 10000 numbers drawn from a uniform distribution on [0,10], with sample mean 5.001.

If I compare Data1 with Data2/10, the KS test concludes that the two distributions are the same, whereas comparing Data1/0.5 with Data2/5.001 concludes that the distributions are different. Is there a way to check the similarity between the distributions in such cases?
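
For concreteness, here is a minimal sketch of the two comparisons described above, assuming Python with NumPy and SciPy (the seed and variable names are illustrative); the exact statistics and p-values will depend on the realized random samples.

    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(0)
    data1 = rng.uniform(0, 1, 10000)   # sample mean close to 0.5
    data2 = rng.uniform(0, 10, 10000)  # sample mean close to 5

    # Comparison using the known scale factor of 10.
    print(ks_2samp(data1, data2 / 10))

    # Comparison after dividing each sample by its own estimated mean;
    # here the standard KS critical values are no longer strictly valid.
    print(ks_2samp(data1 / data1.mean(), data2 / data2.mean()))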

Edit:
As the answer suggests, I can use the KS statistic with a p-value determined via permutation.

My additional difficulty is that the data points are integers:

Data1 consists of 10000 integers drawn uniformly from [0,10], with sample mean 5.

Data2 consists of 10000 integers drawn uniformly from [0,100], with sample mean 50.001.

Is there a test to compare whether Data1 and Data2 are similar apart from scale? Further, I do not know the actual scale; I am estimating it from the data.

These examples are just a proxy for my actual data, which come from two experiments: in one, 10000 people rated a movie on a scale of [0,10], while in the other, 10000 different people rated the same movie on a scale of [0,100]. I want to check whether, apart from the scale, the distributions can be said to be the same or not.

Best Answer

One option is to still use the KS test statistic, but instead of using the standard p-value from the KS test (which, as you say, is not appropriate when parameters are estimated from the data), calculate the p-value using a permutation test. The basic steps (a code sketch follows the list) would be:

1. Calculate the KS test statistic for the data as is (each sample divided by its estimated mean).

2. Combine the two (already divided) datasets, randomly split them into two sets of 10,000 (or whatever the original sample sizes were), and compute the KS statistic for these new "samples".

3. Repeat the previous step many times (e.g., 999 or 9,999 times).

4. The p-value is the proportion of test statistics that are as extreme as or more extreme than your original test statistic.
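
A minimal sketch of this procedure, assuming Python with NumPy and SciPy (the function name, seed, and defaults below are illustrative, not part of the original answer):

    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(0)

    def ks_permutation_pvalue(data1, data2, n_perm=999):
        # Scale each sample by its own estimated mean, as in the question.
        x = np.asarray(data1) / np.mean(data1)
        y = np.asarray(data2) / np.mean(data2)

        # Step 1: KS statistic for the rescaled data as observed.
        observed = ks_2samp(x, y).statistic

        # Steps 2-3: pool the rescaled samples, shuffle, re-split, recompute.
        pooled = np.concatenate([x, y])
        n = len(x)
        count = 0
        for _ in range(n_perm):
            perm = rng.permutation(pooled)
            count += ks_2samp(perm[:n], perm[n:]).statistic >= observed

        # Step 4: proportion of statistics at least as extreme as the observed one
        # (counting the observed statistic itself, so the p-value is never zero).
        return (count + 1) / (n_perm + 1)

    # Example with the uniform data from the question.
    data1 = rng.uniform(0, 1, 10000)
    data2 = rng.uniform(0, 10, 10000)
    print(ks_permutation_pvalue(data1, data2))

Adding one to both the numerator and the denominator is the usual convention for permutation p-values; dropping it recovers the bare proportion described in step 4.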
