Solved – Conflicting results with GoF tests

Tags: goodness-of-fit, MATLAB

I am a computer scientist with little statistics background, and I am trying to find the best-fitting distribution for a data set (using MATLAB). To assess the goodness of fit I use both the Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) tests; here are the p-values for the same data set:

Distribution    AD test p-value    KS test p-value
Exp             0.439              1.49e-7
Weibull         0.498              1.40e-6
Pareto          0.244              6.24e-14
Logn            0.684              2.69e-4
Gamma           0.595              2.16e-4

I use a significance level of 0.05, and as far as I know a p-value < 0.05 means the null hypothesis is rejected, which is the case for all of the KS results. What should I conclude from this? The KS test says that none of the distributions is a good fit, while the AD test cannot reject any of them.
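For context, here is a minimal sketch of how such a comparison can be run in MATLAB (kstest, adtest, and fitdist are from the Statistics and Machine Learning Toolbox; the data vector x is a placeholder, not my actual code):

% Minimal sketch (assumed data vector x):
pd = fitdist(x, 'Exponential');             % fit one candidate distribution
[~, p_ks] = kstest(x, 'CDF', pd);           % Kolmogorov-Smirnov vs. fitted CDF
[~, p_ad] = adtest(x, 'Distribution', pd);  % Anderson-Darling vs. same fit
fprintf('KS p = %.3g, AD p = %.3g\n', p_ks, p_ad);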

Edit:
Here is an overview of what we do in our code:

fit_functions = {@wblfit, @expfit, @lognfit, @gpfit, @gamfit};
for i = 1:length(fit_functions)
    % fit and run_gof_tests are our own wrappers; x is the number of
    % outputs returned by the i-th fitting function
    [varargout{1:x}] = fit(param, fit_functions{i});
    [ad_result, ks_result] = run_gof_tests(param, cdf_functions{i}, ...
                                           varargout{:});
end

In the run_gof_tests function we average 1,000 p-values to obtain the final p-value; each individual p-value is computed on fifty samples drawn randomly from the data set. We use this method for the reasons described on p. 12 of this tech report: Modeling Machine Availability in Enterprise and Wide-area Distributed Computing Environments by Nurmi et al., UCSB Computer Science Technical Report CS2003-28. A sketch of what such a function could look like follows.
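For what it's worth, here is a minimal sketch of that subsampling scheme; the signature, the use of a fitted distribution object pd, and sampling without replacement are all assumptions on my part, not the actual code:

function [ad_p, ks_p] = run_gof_tests(data, pd)
    % Average 1,000 p-values, each computed on 50 randomly drawn points
    nreps = 1000;  m = 50;
    ad_ps = zeros(nreps, 1);  ks_ps = zeros(nreps, 1);
    for r = 1:nreps
        s = randsample(data, m);                 % without replacement (assumed)
        [~, ad_ps(r)] = adtest(s, 'Distribution', pd);
        [~, ks_ps(r)] = kstest(s, 'CDF', pd);
    end
    ad_p = mean(ad_ps);  ks_p = mean(ks_ps);     % the averaged p-values
end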

Thanks!

Best Answer

Remember that both tests are rule-out tests, and they measure the difference between the null distribution and the data in different ways. Also, how did you choose the parameters for the distributions you tested? That can influence how the tests behave, especially if the parameters are estimated from the data but the tests are not told so (the standard p-values assume the parameters were fixed in advance), or if default values are used that don't match the data (and the defaults may even differ between the two tests).
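To make the estimation caveat concrete, here is a minimal MATLAB sketch (an assumed toy setup, not the asker's code): when parameters are fitted from the same data, the plain kstest p-value is miscalibrated, while a corrected test such as lillietest (which MATLAB provides for the normal, exponential, and extreme value families) accounts for the estimation; the exact name-value syntax may vary by release.

rng(1);                                    % toy setup, for illustration only
x  = exprnd(2, 200, 1);                    % data that truly is exponential
pd = fitdist(x, 'Exponential');            % parameters estimated from x itself

[~, p_naive]  = kstest(x, 'CDF', pd);      % pretends the parameters were known
[~, p_lillie] = lillietest(x, 'Distribution', 'exponential');  % corrects for estimation
fprintf('naive KS p = %.3f, Lilliefors p = %.3f\n', p_naive, p_lillie);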

Consider two distributions. The first is the standard uniform on (0, 1). The second also has density 1, but on (0, 0.99) and again on (9,999.99, 10,000), and is 0 elsewhere. Are these distributions very different from each other? For most small samples they will look almost identical, but the second will produce occasional extreme outliers compared to the first. The KS test looks at the maximum difference between the cumulative distribution functions: these two CDFs are identical up to 0.99 and then differ by at most 0.01, so the KS test is very unlikely to see a difference. The AD test looks at the same difference but applies a different weighting. A test based on the mean and variance, on the other hand, would see a huge difference between these two distributions, because their means and variances differ enormously. Often when different tests disagree, it is because of differences in what they are measuring.
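As a rough numerical illustration of that example (my own sketch, using small simulated samples):

rng(1);  n = 100;
x1 = rand(n, 1);                           % standard Uniform(0,1)
x2 = 0.99 * rand(n, 1);                    % density 1 on (0, 0.99)...
far = rand(n, 1) > 0.99;                   % ...plus mass 0.01 far out
x2(far) = 9999.99 + 0.01 * rand(nnz(far), 1);

[~, p, d] = kstest2(x1, x2);               % true max CDF gap is only 0.01
fprintf('KS distance %.3f (p = %.2f)\n', d, p);   % small samples: no rejection

% Theoretical means: 0.5 for the first distribution, about 100.5 for the second
m2 = 0.99 * 0.495 + 0.01 * 9999.995;
fprintf('theoretical means: 0.5 vs %.2f\n', m2);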
