Solved – Distribution specificity of the Anderson-Darling test

anderson darling test, distributions, goodness of fit, kolmogorov-smirnov test

When fitting a (gamma) distribution to a data sample and, in turn, estimating its parameters (shape/scale), a Kolmogorov–Smirnov goodness-of-fit test is no longer applicable, because the distribution being tested against has been derived from the data itself. This problem can be largely alleviated by parametric bootstrapping, in which new critical values and a new p-value are obtained by simulation.
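For concreteness, here is a minimal sketch of that parametric bootstrap using SciPy; the generated sample, sample size, fixed zero location and number of replicates are illustrative choices of mine, not part of the original question.

```python
# Parametric bootstrap for a KS test of a gamma fit with estimated parameters.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = stats.gamma.rvs(a=3, scale=10, size=80, random_state=rng)  # illustrative data

def ks_stat_fitted_gamma(sample):
    """KS statistic against a gamma whose parameters are fitted to `sample`."""
    a, loc, scale = stats.gamma.fit(sample, floc=0)  # two-parameter gamma: location fixed at 0
    return stats.kstest(sample, stats.gamma(a, loc=loc, scale=scale).cdf).statistic

d_obs = ks_stat_fitted_gamma(x)

# Simulate from the fitted model and re-fit inside each replicate, so the
# statistic's null distribution reflects the effect of parameter estimation.
a_hat, _, scale_hat = stats.gamma.fit(x, floc=0)
n_boot = 2000
d_boot = np.empty(n_boot)
for b in range(n_boot):
    xb = stats.gamma.rvs(a_hat, scale=scale_hat, size=len(x), random_state=rng)
    d_boot[b] = ks_stat_fitted_gamma(xb)

p_value = np.mean(d_boot >= d_obs)
crit_95 = np.quantile(d_boot, 0.95)
print(f"D = {d_obs:.4f}, bootstrap p = {p_value:.3f}, 5% critical value = {crit_95:.4f}")
```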

I've read that the Anderson-Darling test (a modified form of the Kolmogorov–Smirnov test) typically performs better and is more sensitive to outliers.

It is my understanding that the Anderson-Darling test does not suffer from the same issue outlined above, as it calculates critical values for each distribution being tested against as an inherent part of the test. Is this correct? Is this done on an individual-distribution basis? For example, will $\Gamma(\alpha,\beta)$ with $\alpha=3,\beta=10$ produce a different set of critical values than $\alpha=2,\beta=15$?

Secondly, are there any cases when the Kolmogorov–Smirnov test is preferred over the Anderson-Darling test?

Best Answer

It is my understanding that the AD-test does not suffer from the same issue outlined above as it calculates critical values for each distribution being tested against as an inherent part of the test - is this correct?

No, it's a test for a fully specified distribution, just like the Kolmogorov-Smirnov. When you estimate parameters but use the Kolmogorov-Smirnov statistic with different tables to account for that, it's properly called the Lilliefors test; that test is discussed in numerous posts on this site.

However, in many cases you can adjust the Anderson-Darling test statistic to account for the estimation of parameters. (Failing that, your approach of simulation to obtain new critical values or p-values for the specific case at hand can work quite well.)

For example, in the case of estimating mean and variance and testing for normality, if the usual statistic $A^2$ is replaced with $A^{*2}=(1+4/n-25/n^2)A^2$ then the usual tables may be used to reasonable accuracy even at quite small sample sizes.
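As a sketch of how such an adjustment is applied in practice (the data, sample size and use of SciPy below are my own illustrative choices), the statistic can be computed from its usual sum form and then rescaled:

```python
# Anderson-Darling statistic for normality with estimated mean and variance,
# followed by the (1 + 4/n - 25/n^2) rescaling quoted above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=50)  # illustrative data
n = len(x)

# Standardize with the estimated parameters; this estimation step is exactly
# what invalidates the unadjusted (fully specified) tables.
z = np.sort((x - x.mean()) / x.std(ddof=1))
cdf = stats.norm.cdf(z)

# A^2 = -n - (1/n) * sum_{i=1..n} (2i-1) [ln F(z_(i)) + ln(1 - F(z_(n+1-i)))]
i = np.arange(1, n + 1)
A2 = -n - np.mean((2 * i - 1) * (np.log(cdf) + np.log(1 - cdf[::-1])))
A2_star = (1 + 4 / n - 25 / n**2) * A2

print(f"A^2 = {A2:.4f}, adjusted A*^2 = {A2_star:.4f}")
```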

Alternatively, in the case of normality, there's a different adjusted statistic $A^{*2}=(1+0.75/n+2.25/n^2)A^2$ (with its own table). To my recollection, details are given in D'Agostino and Stephens (1986) [1], but this result (and more) can also be found in a technical report by Stephens (1979), which additionally gives an adjusted statistic for the exponential case (and several other distributions, but not the gamma).

Note that the Stephens report's adjustment for the exponential again comes with its own table, and corresponding adjustments exist for the Weibull and Gumbel cases.

Is this done on an individual-distribution basis? For example, will $\Gamma(\alpha=3,\beta=10)$ produce a different set of critical values than $\Gamma(\alpha=2,\beta=15)$?

Certainly in the case of the beta parameter you only need to adjust for the fact of estimation itself; the value of the parameter makes no difference, because (depending on the parameterization you intended) $\beta$ is either a scale parameter or the inverse of a scale parameter.

It's not clear that the alpha parameter always has the same property, but I can be reasonably confident that, except at small values of $\alpha$ (down near 1 and below), it won't matter much. If your gamma has a peak that's well to the right of 0 (i.e. $\alpha \gg 1$), the cube root is an approximately normalizing transform, so different values of $\alpha$ shouldn't change the null distribution of the statistic much; most of the effect will come from the estimation of alpha rather than from its value. This holds more generally: if there's a monotonic transformation, not depending on the parameters, that produces a location-scale family, then the parameter values themselves won't matter.
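As a quick numeric illustration of the cube-root point (the shape values, scale and sample size below are arbitrary choices of mine, not from the answer):

```python
# Check how close the cube root of a gamma variate is to normal for a few shapes:
# skewness and excess kurtosis both move toward 0 (the normal values) as alpha grows.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
for alpha in (2, 3, 10):
    y = stats.gamma.rvs(alpha, scale=10, size=100_000, random_state=rng) ** (1 / 3)
    print(f"alpha = {alpha:2d}: skewness of X^(1/3) = {stats.skew(y):+.3f}, "
          f"excess kurtosis = {stats.kurtosis(y):+.3f}")
```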

So the fact that there's a transformation to near-normality for any reasonably large $\alpha$ suggests that the adjustment used for the normal ($A^{*2}=(1+4/n-25/n^2)A^2$) may also work fairly well for gammas, at least those with a non-small shape parameter.

For cases where the shape parameter is, or might be, small (somewhere in the region of 1, or smaller), you may need to consider dealing with it individually, but I don't know for certain that this is necessary; the same adjustment may work adequately even there, but you'd need to check.
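Here is a sketch of the kind of check suggested above: simulate the null distribution of $A^2$ with shape and scale estimated, for the two parameter pairs in the question, and compare upper quantiles. The sample size and replicate count are arbitrary choices of mine.

```python
# Simulated null quantiles of the Anderson-Darling statistic for a fitted
# two-parameter gamma, at two different true (shape, scale) pairs.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def a2_fitted_gamma(sample):
    """Anderson-Darling statistic against a gamma fitted to `sample` (location fixed at 0)."""
    n = len(sample)
    a, loc, scale = stats.gamma.fit(sample, floc=0)
    cdf = np.sort(stats.gamma.cdf(sample, a, loc=loc, scale=scale))
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * (np.log(cdf) + np.log(1 - cdf[::-1])))

n, reps = 50, 2000
for alpha, beta in [(2, 15), (3, 10)]:
    null_stats = [a2_fitted_gamma(stats.gamma.rvs(alpha, scale=beta, size=n,
                                                   random_state=rng))
                  for _ in range(reps)]
    print(f"alpha={alpha}, beta={beta}: 90/95/99% points of A^2 =",
          np.round(np.quantile(null_stats, [0.90, 0.95, 0.99]), 3))
# If the reasoning above holds, the two rows should be close to each other.
```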

are there any cases when the KS-test is preferred over the AD-test?

If by "is preferred" you mean "has greater power", then certainly. Where the Anderson-Darling tends to perform better at picking up differences in the tail (especially where the tail may be heavier than supposed), the Kolmogorov-Smirnov tends to have better power at differences "in the middle", and where tails are lighter than supposed. [In some of these situations -- such as testing a uniform against a beta with parameters somewhat larger than 1 -- both of them perform badly and you probably would prefer something else over either, but the Kolmogorov-Smirnov can be considerably less terrible than the Anderson-Darling.]


Note that I wouldn't really call the Anderson-Darling a "modified form of the Kolmogorov-Smirnov". I'd say it's a modified (specifically, weighted) form of the Cramer-von Mises test. They're all tests based on the ECDF, but the Kolmogorov-Smirnov looks at the maximum distance between the ECDF and the hypothesized CDF, while the Cramer-von Mises looks at something related to a sum of squared distances, which assesses discrepancy quite differently (and generally has better power against the more interesting alternatives). The Anderson-Darling then accounts for the fact that the variance of the ECDF is smaller near $0$ or $1$ than it is near $\frac12$, re-weighting the squared deviations for their relative precision; this is why it does better at finding deviations in the tail, especially ones associated with being "further" into the tail of the specified distribution.
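To make the relationship between the three statistics concrete, here is a small sketch computing all of them for one sample against a fully specified null (the data and the null distribution are illustrative choices of mine); only the weighting distinguishes the Cramer-von Mises and Anderson-Darling sums.

```python
# KS, Cramer-von Mises and Anderson-Darling statistics from the same sample,
# using the standard computing forms with u_i = F(x_(i)) for a fully specified F.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = stats.t.rvs(3, size=60, random_state=rng)   # heavier-tailed data...
u = np.sort(stats.norm.cdf(x))                  # ...tested against N(0,1)
n = len(u)
i = np.arange(1, n + 1)

# Kolmogorov-Smirnov: the single largest ECDF-CDF gap.
D = np.max(np.maximum(i / n - u, u - (i - 1) / n))

# Cramer-von Mises: an (unweighted) sum of squared gaps.
W2 = 1 / (12 * n) + np.sum((u - (2 * i - 1) / (2 * n)) ** 2)

# Anderson-Darling: the same idea, but the squared gaps are weighted by
# 1 / [F(1 - F)], which up-weights deviations in the tails.
A2 = -n - np.mean((2 * i - 1) * (np.log(u) + np.log(1 - u[::-1])))

print(f"KS D = {D:.3f},  CvM W^2 = {W2:.3f},  AD A^2 = {A2:.3f}")
```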

[1] D'Agostino, R.B. and Stephens, M.A. (1986), Goodness-of-Fit Techniques, New York: Marcel Dekker.
