1) There are two issues with the Kolmogorov-Smirnov test*:
a) it assumes the distribution is completely specified, with no estimated parameters. If you estimate parameters, the KS becomes a form of Lilliefors test (in this case, one for Poisson-ness), and you need different critical values;
b) it assumes the distribution is continuous
Both impact the calculation of p-values, and both make the test less likely to reject.
*(and the Cramér-von Mises and the Anderson-Darling, and any other test that assumes a continuous, completely specified null)
Unless you don't mind a potentially highly conservative test (of unknown size), you have to adjust the calculation of significance for both of these issues; simulation would be called for.
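A minimal sketch of that simulation in Python: a parametric bootstrap of the KS statistic under a fitted Poisson null, re-estimating the mean in every replicate so the resulting p-value accounts for both the discreteness and the parameter estimation. The function names and the toy data are mine, not from the question.

```python
import numpy as np
from scipy import stats

def ks_stat_poisson(x, lam):
    # sup-distance between the empirical CDF and the Poisson(lam) CDF,
    # evaluated on the integer support (both are step functions there)
    grid = np.arange(x.max() + 2)
    F = stats.poisson.cdf(grid, lam)
    ecdf = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    return np.abs(ecdf - F).max()

def ks_poisson_bootstrap(x, n_boot=1000, seed=0):
    # parametric bootstrap: simulate from the fitted Poisson and
    # re-estimate lambda each time, mimicking what was done to the data
    rng = np.random.default_rng(seed)
    lam_hat = x.mean()                      # MLE of the Poisson mean
    d_obs = ks_stat_poisson(x, lam_hat)
    d_boot = np.empty(n_boot)
    for b in range(n_boot):
        xb = rng.poisson(lam_hat, size=len(x))
        d_boot[b] = ks_stat_poisson(xb, xb.mean())
    # bootstrap p-value: fraction of null replicates at least as extreme
    return d_obs, (d_boot >= d_obs).mean()

rng = np.random.default_rng(0)
x = rng.poisson(3.0, size=200)              # toy data; use your own counts
d, p = ks_poisson_bootstrap(x)
```

Using the naive KS tables instead of this simulated reference distribution is exactly what makes the test conservative.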
2) On the other hand, a vanilla chi-square goodness-of-fit test is a terrible idea when testing something that's ordered, as a Poisson is. By ignoring the ordering, it's really not very sensitive to the more interesting alternatives: it throws away power against directly interesting alternatives like overdispersion, instead spending it against things like 'an excess of even numbers over odd numbers'. As a result, its power against interesting alternatives is generally even lower than the vanilla KS, but without the compensation of a much lower type I error rate.
I think this is even worse.
3) On the gripping hand, you can partition the chi-square into components that do respect the ordering via the use of orthogonal polynomials, and drop the less interesting highest-order components. In this particular case you'd use polynomials orthogonal with respect to the Poisson p.f.
This is the approach taken in Rayner and Best's little 1989 book on Smooth Tests of Goodness of Fit (they have a newer one on smooth tests in R that might make your life easier).
Alternatively, see papers like this one:
http://www.jstor.org/discover/10.2307/1403470
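To make the idea of "keeping only the interesting low-order component" concrete: the classical Poisson index-of-dispersion test is closely related to the lowest-order interesting component of such a partition, and it targets over/underdispersion directly rather than spreading power over everything. A sketch (the function name `dispersion_test` is mine):

```python
import numpy as np
from scipy import stats

def dispersion_test(x):
    # index-of-dispersion test for Poisson-ness:
    # (n - 1) * sample variance / sample mean ~ chi2(n - 1) under the null.
    # Large values indicate overdispersion, small ones underdispersion.
    x = np.asarray(x, dtype=float)
    n = len(x)
    stat = (n - 1) * x.var(ddof=1) / x.mean()
    p_hi = stats.chi2.sf(stat, n - 1)       # overdispersed direction
    p_lo = stats.chi2.cdf(stat, n - 1)      # underdispersed direction
    return stat, min(1.0, 2 * min(p_hi, p_lo))  # two-sided p-value

rng = np.random.default_rng(1)
x = rng.poisson(4.0, size=300)              # toy data
stat, p = dispersion_test(x)
```

Unlike the unordered chi-square of point 2, all of this statistic's power is aimed at the directly interesting alternative.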
4) However, depending on why you're doing it, it may be better to reconsider the whole enterprise...
The discussion in questions like these carries over to most goodness-of-fit tests ... and indeed often to most tests of assumptions in general:
Is normality testing 'essentially useless'?
What tests do I use to confirm that residuals are normally distributed?
There may be more to it, but to me it seems that you just want to determine goodness of fit (GoF) for a function f(a), fitted to a particular data set (a, f(a)). So the following only answers your third sub-question (I don't think the first and second are directly relevant to the third).
Usually, GoF can be determined parametrically (if you know the distribution's parameters) or non-parametrically (if you don't know them). You may well be able to figure out the parameters, as the function appears to be exponential or gamma/Weibull (assuming the data are continuous); nevertheless, I will proceed as if you didn't know them. In that case, it's a two-step process. First, you determine the distribution parameters for your data set. Second, you perform a GoF test for the fitted distribution. To avoid repeating myself, at this point I will refer you to my earlier answer to a related question, which contains some helpful details. Obviously, that answer can easily be applied to distributions other than the one mentioned within.
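The two-step process can be sketched in Python with scipy; the gamma family and all numbers below are illustrative stand-ins, and note the caveat in the comment about feeding estimated parameters into a KS test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.gamma(shape=2.0, scale=1.5, size=500)  # stand-in for your data

# Step 1: estimate the distribution's parameters from the data (MLE);
# floc=0 fixes the location so we fit a two-parameter gamma
shape, loc, scale = stats.gamma.fit(data, floc=0)

# Step 2: run a GoF test against the *fitted* distribution.
# Caveat: because the parameters were estimated from the same data, this
# nominal KS p-value is conservative; a parametric bootstrap gives a
# properly calibrated p-value.
d, p_naive = stats.kstest(data, "gamma", args=(shape, loc, scale))
```

The same two steps apply to exponential, Weibull, or any other candidate family: swap the `fit` call and the distribution name.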
In addition to the GoF tests mentioned there, you may consider another test: the chi-square GoF test. Unlike the K-S and A-D tests, which are applicable only to continuous distributions, the chi-square GoF test is applicable to both discrete and continuous ones. It can be performed in R using one of several packages: the built-in stats package (function chisq.test()) and the vcd package (function goodfit(), for discrete data only). More details are available in this document.
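For a discrete distribution, the same chi-square GoF test can be sketched in Python with scipy (simulated toy counts; the tail-pooling cutoff `K` is my choice, and `ddof=1` reflects the one parameter estimated from the data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.poisson(2.0, size=400)           # toy discrete data
lam_hat = x.mean()                       # one estimated parameter

# observed counts for k = 0..K-1, with everything >= K pooled into one bin
# so that expected counts per bin stay reasonably large
K = 6
observed = np.bincount(np.minimum(x, K), minlength=K + 1)

# expected counts under Poisson(lam_hat), tail pooled the same way
probs = stats.poisson.pmf(np.arange(K), lam_hat)
probs = np.append(probs, 1 - probs.sum())        # P(X >= K)
expected = len(x) * probs

# ddof=1 lowers the default df of (#bins - 1) by the 1 estimated parameter
chi2, p = stats.chisquare(observed, expected, ddof=1)
```

This mirrors what goodfit() does for discrete data, subject to the ordering caveat raised in the other answer.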
Best Answer
Does chi-square work for you? You sum, over bins, the squared difference between observed and expected counts divided by the expected count, and can form a reduced chi-square using the number of bins and the number of optimized parameters (related to the degrees of freedom, which is what the 'ddof' argument adjusts in scipy's syntax). A reduced chi-square close to one generally indicates a good fit.
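A minimal numeric sketch of that computation (the bin counts and the number of fitted parameters below are made up for illustration):

```python
import numpy as np
from scipy import stats

# hypothetical binned data: observed counts and model-expected counts
observed = np.array([18, 31, 42, 38, 27, 14])
expected = np.array([20.0, 30.0, 40.0, 40.0, 25.0, 15.0])
m = 2                                    # number of optimized parameters

chi2 = ((observed - expected) ** 2 / expected).sum()
dof = len(observed) - 1 - m              # bins minus 1 minus fitted params
reduced = chi2 / dof                     # ~ 1 suggests a good fit

# scipy equivalent: ddof shifts the default df of (bins - 1) down by m
chi2_sp, p = stats.chisquare(observed, expected, ddof=m)
```

Note that scipy's `ddof` is the *extra* reduction beyond the default `bins - 1`, so `ddof=m` matches the manual `dof` above.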