Solved – Q-Q plot and KS test

kolmogorov-smirnov test, qq-plot, r

I have a question about the Q-Q plot and the KS test.
My sample is a set of positions on the genome, and I calculated the distances between these positions.
I generated the null distribution by randomly drawing positions from the genome and calculating the distances between them in the same way (1000 times).
I want to test whether the distribution of distances in my sample is the same as the distribution of distances between randomly selected positions.

First I used a Q-Q plot to check this, and the plot looks like this:

QQ plot on log10 scale

The Q-Q plot is skewed, so I wanted to use the KS test to see whether the two samples come from the same distribution. However, I could not get a significant p-value from the ks.test() function in R. The p-value is around 0.09…

To be clear, this is the histogram of the null distribution I generated by simulation:
[image: histogram of the simulated null distribution]

Does anyone have an idea why I couldn't get a significant result from the KS test? Or is there an alternative method I could use to test the difference between these two distributions? Thanks so much!

Best Answer

Got too long for a comment.

It looks to me like you might (perhaps) be conflating two different things, and that might be where your problem comes from, but it's not clear enough what you did to be sure.

If you want to do a KS test of the distribution in your initial Q-Q plot, the data there should be the thing passed to the KS test. What are the theoretical values in that plot? How were they obtained?
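To make "the data in the plot should be the data in the test" concrete, here is a minimal sketch in R. The vectors `sample_dist` and `null_dist` are made-up lognormal draws standing in for your observed and simulated distances; they are assumptions for illustration, not your data. The same two vectors go into both `qqplot()` and `ks.test()`. The sketch also checks that plotting on a log10 scale does not require a different test, since the two-sample KS statistic depends only on the ordering of the values.

```r
set.seed(3)  # only so this illustration is reproducible

# Made-up positive "distances" (hypothetical stand-ins for the real data)
sample_dist <- rlnorm(200,  meanlog = 7, sdlog = 1)
null_dist   <- rlnorm(5000, meanlog = 7, sdlog = 1)

# Two-sample Q-Q plot on the log10 scale: quantiles of one vector
# against quantiles of the other
qq <- qqplot(log10(null_dist), log10(sample_dist),
             xlab = "log10 simulated distance",
             ylab = "log10 sample distance")
abline(0, 1, lty = 2)  # reference line y = x

# KS test on exactly the same two vectors, raw and log10-transformed
ks_raw <- ks.test(sample_dist, null_dist)
ks_log <- ks.test(log10(sample_dist), log10(null_dist))

# Compare the two statistics; a monotone transform should not change them
all.equal(unname(ks_raw$statistic), unname(ks_log$statistic))
```

If the points in your Q-Q plot were not produced from the same pair of vectors you gave to `ks.test()`, the plot and the p-value are answering different questions.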

If the histogram is the simulated distribution of some statistic (as the title suggests), you wouldn't need a KS test - you just look to see where your sample value lies in that distribution.
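For that second case, a rough sketch of "see where your sample value lies" might look like the following. Everything specific here is an assumption for illustration: the genome length, the number of positions, and the choice of statistic (`median_gap`, the median distance between adjacent sorted positions) are hypothetical, since the question does not say which statistic was simulated.

```r
set.seed(1)  # only so this illustration is reproducible

# Hypothetical setup: a made-up genome length and sample size
genome_length <- 1e6
n_positions   <- 50

# Hypothetical summary statistic: median distance between adjacent positions
median_gap <- function(pos) median(diff(sort(pos)))

# Stand-in for the observed positions (replace with the real sample)
observed_pos  <- sample.int(genome_length, n_positions)
observed_stat <- median_gap(observed_pos)

# Simulated null distribution of the statistic: 1000 random draws,
# each summarised in exactly the same way as the observed sample
null_stats <- replicate(1000,
                        median_gap(sample.int(genome_length, n_positions)))

# Monte Carlo p-values; the +1 counts the observed value itself,
# so the estimate can never be exactly zero
n_sim   <- length(null_stats)
p_lower <- (sum(null_stats <= observed_stat) + 1) / (n_sim + 1)
p_upper <- (sum(null_stats >= observed_stat) + 1) / (n_sim + 1)
p_two_sided <- min(1, 2 * min(p_lower, p_upper))
p_two_sided
```

Here the histogram would be `hist(null_stats)`, and the observed value is a single number marked on it, for example with `abline(v = observed_stat)`. No KS test is involved.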

What exactly is being displayed in the histogram? What was the sample value you're comparing with it? What are the arguments to the KS test? Is it a two-sample test, or is the sample being compared with some theoretical distribution?
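The reason the arguments matter is that `ks.test()` does different things depending on what the second argument is. A small sketch, again with made-up exponential data (`sample_dist` and `null_dist` are hypothetical names, and the exponential rate is an arbitrary choice):

```r
set.seed(2)  # only so this illustration is reproducible

# Made-up data standing in for observed and simulated distances
sample_dist <- rexp(200,  rate = 1 / 1000)
null_dist   <- rexp(5000, rate = 1 / 1000)

# Two-sample test: both arguments are numeric vectors of raw values
two_sample <- ks.test(sample_dist, null_dist)

# One-sample test: the second argument names a cumulative distribution
# function, and that distribution's parameters follow it
one_sample <- ks.test(sample_dist, "pexp", rate = 1 / 1000)

# The method string reports which test was actually run
two_sample$method
one_sample$method
```

Checking `$method` on the result of your own call is a quick way to confirm which of the two tests produced the p-value of about 0.09.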

Can you please explain what you're doing at each stage and what numbers you're doing it to? (That is, if we had your original sample, how could we reproduce what you did?)