I have a data set with n = 90 that probably follows the gamma distribution (other distributions are also candidates). I used maximum-likelihood estimation (MLE) to estimate the alpha and beta parameters of the gamma distribution in Matlab.
What is the best way to test the goodness of fit of the gamma distribution with the estimated parameters against the original data set?
Can I compare the empirical and theoretical cumulative distribution functions (cdf)?
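[empirical_cdf, x] = ecdf(data);                     % empirical cdf evaluated at the points x
theoretical_cdf = gamcdf(x, alpha_hat, beta_hat);   % fitted gamma cdf at the same points (MLE estimates)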
And then apply some test, for example the two-sample KS test:
kstest2(empirical_cdf, theoretical_cdf)
Is this the correct way?
Many thanks
The histogram in the last question is only an example of one data set (1 of 10000).
I'll rephrase my question: I have a total of 10000 data sets, and I wonder whether the gamma distribution is better (in terms of goodness of fit) than the Weibull distribution, for example.
Or:
Of the 10000 data sets, what percentage is fit better by the gamma distribution, and what percentage by the Weibull distribution?
As you can see, I have far too many data sets to check one by one.
What is the best way to assess goodness of fit in order to find these percentages?
Many thanks
Best Answer
I don't use Matlab, but how about we check the documentation of the function (kstest2). It says:
So no: kstest2 compares two samples, so it is not meant for comparing a fitted distribution to a sample.
What about kstest? Well, if we check the documentation there, the answer is still no:
That pretty much covers it: the usual KS test assumes a fully specified distribution, so its p-values are not valid when the parameters were estimated from the same data. There's a Lilliefors test (Matlab has a function for the normal case, mentioned in the documentation for kstest). You could do something similar to that by simulating the distribution of the test statistic.
But often people test goodness of fit in situations in which it's not really useful to do so. (This may be the case here as well: why are you testing goodness of fit?)
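The simulation idea above could be sketched roughly as follows. This is only an illustration, not a tested recipe: it assumes the Statistics and Machine Learning Toolbox (fitdist, random, kstest, gamrnd), and the placeholder data, the seed, and the number of simulated samples nBoot are all made up for the example. The key point is that the parameters are re-estimated on every simulated sample, so the simulated statistics reflect the same estimation step as the observed one.

```matlab
% Parametric-bootstrap (Lilliefors-style) KS test for a fitted gamma.
% Assumes the Statistics and Machine Learning Toolbox is available.
rng(1);                                   % arbitrary seed, for reproducibility only
data  = gamrnd(2, 3, 90, 1);              % placeholder sample: replace with your own n = 90 data
nBoot = 1000;                             % number of simulated samples (assumed value)

pd = fitdist(data, 'Gamma');              % MLE fit of the gamma shape and scale
[~, ~, ksObs] = kstest(data, 'CDF', pd);  % observed KS distance; kstest's own p-value is ignored

ksSim = zeros(nBoot, 1);
for i = 1:nBoot
    simData = random(pd, numel(data), 1);            % draw a sample from the fitted gamma
    pdSim   = fitdist(simData, 'Gamma');             % re-estimate the parameters each time
    [~, ~, ksSim(i)] = kstest(simData, 'CDF', pdSim);
end

% Simulated p-value: share of simulated statistics at least as large as the observed one
pValue = (1 + sum(ksSim >= ksObs)) / (nBoot + 1);
fprintf('KS statistic = %.4f, simulated p-value = %.3f\n', ksObs, pValue);
```

With many data sets, the same steps could be wrapped in a function and called in a loop, though the caveat above still applies: a test like this says whether the gamma can be rejected for a data set, not which of two candidate distributions fits better.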