Goodness-of-Fit – Anderson-Darling Test vs Cramér-von Mises Criterion Explained

Tags: anderson-darling-test, goodness-of-fit

I was reading web pages about goodness-of-fit tests when I came across the Anderson–Darling test and the Cramér–von Mises criterion.

So far I get the point: the Anderson–Darling test and the Cramér–von Mises criterion appear to be similar, differing only in the weighting function $w$. There is also a variant of the Cramér–von Mises criterion called the Watson test.
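If I've read the definitions correctly, all three are quadratic EDF statistics of the form

$$ Q = n \int_{-\infty}^{\infty} \big[F_n(x) - F(x)\big]^2\, w(x)\, dF(x), $$

where $F_n$ is the empirical distribution function and $F$ the hypothesized one: the Cramér–von Mises criterion uses $w(x) = 1$, the Anderson–Darling test uses $w(x) = 1/\{F(x)[1-F(x)]\}$, and Watson's statistic is the Cramér–von Mises statistic corrected for the mean deviation,

$$ U^2 = \omega^2 - n \left( \int_{-\infty}^{\infty} \big[F_n(x) - F(x)\big]\, dF(x) \right)^{2}. $$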

Basically, I have two questions here:

  1. There are not many Google results about these two methods; are they still state-of-the-art, or have they already been replaced by better approaches?

    This is a bit of a surprise, because according to this paper on power comparisons of the Shapiro–Wilk, Kolmogorov–Smirnov, Lilliefors and Anderson–Darling tests, AD performs quite well: always better than Lilliefors and KS, and very close to SW, which is designed specifically for the normal distribution.

  2. What is the confidence interval for such tests?

    For the AD, CvM and Watson tests, I saw the test statistic defined on the Wikipedia pages, but I couldn't find the confidence interval.

    Things are more straightforward for the KS test: on its Wikipedia page, the confidence interval is given in terms of $K_\alpha$, which is defined from the cumulative distribution function of the Kolmogorov distribution $K$.
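For concreteness, the KS construction I mean is the following: with $D_n = \sup_x |F_n(x) - F(x)|$ and $K_\alpha$ chosen so that $\Pr(K \le K_\alpha) = 1 - \alpha$ for the Kolmogorov distribution,

$$ \Pr\!\left( \sqrt{n}\, D_n \le K_\alpha \right) \;\longrightarrow\; 1 - \alpha \quad (n \to \infty), $$

which inverts to the asymptotic simultaneous confidence band $F_n(x) - K_\alpha/\sqrt{n} \le F(x) \le F_n(x) + K_\alpha/\sqrt{n}$ for all $x$.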

Best Answer

There can be no single state of the art for goodness of fit (for example, no UMP test across general alternatives exists, and really nothing even comes close -- even highly regarded omnibus tests have terrible power in some situations).

In general, when selecting a test statistic, you decide which kinds of deviation are most important to detect and use a statistic that is good at that job. Some tests do very well against a wide variety of interesting alternatives, making them decent default choices, but that doesn't make them "state of the art".

The Anderson–Darling test is still very popular, and with good reason. The Cramér–von Mises test is much less used these days (to my surprise, because it's usually better than the Kolmogorov–Smirnov test but simpler than the Anderson–Darling, and often has better power than it against differences "in the middle" of the distribution).

All of these tests are biased against some kinds of alternatives, and it's easy to find cases where the Anderson–Darling does much worse (terribly, really) than the other tests. (As I suggest, it's more "horses for courses" than one test to rule them all.) Unfortunately, little consideration is usually given to the question that matters most: which test is best at picking up the deviations I most care about?
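As a rough illustration of how power depends on the alternative, here is a minimal Monte Carlo sketch (assuming SciPy ≥ 1.6 is available for `cramervonmises`; the sample size, number of replications and the $t_5$ alternative are arbitrary illustrative choices, not recommendations). It estimates the power of the KS and Cramér–von Mises tests of a fully specified $N(0,1)$ null when the data actually come from a heavy-tailed $t_5$ distribution rescaled to unit variance:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12345)
n, n_sim, alpha = 50, 2000, 0.05

# Count rejections of the fully specified N(0,1) null at level alpha
# when the data actually come from a scaled t_5 (heavy-tailed) alternative.
rejections = {"Kolmogorov-Smirnov": 0, "Cramer-von Mises": 0}
for _ in range(n_sim):
    x = rng.standard_t(df=5, size=n) / np.sqrt(5 / 3)  # rescale to unit variance
    if stats.kstest(x, "norm").pvalue < alpha:
        rejections["Kolmogorov-Smirnov"] += 1
    if stats.cramervonmises(x, "norm").pvalue < alpha:
        rejections["Cramer-von Mises"] += 1

for name, count in rejections.items():
    print(f"{name}: estimated power ~ {count / n_sim:.3f}")
```

Swapping in a different alternative (a mean shift, say, or a short-tailed distribution) can easily reverse the ranking, which is exactly the "horses for courses" point above.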

You may find some value in some of these posts:

Is Shapiro–Wilk the best normality test? Why might it be better than other tests like Anderson-Darling?

2 Sample Kolmogorov-Smirnov vs. Anderson-Darling vs Cramer-von-Mises (about two-sample tests, but many of the statements carry over)

Motivation for Kolmogorov distance between distributions (more theoretical discussion but there are several important points about practical implications)


I don't think you'll be able to form a confidence interval (or band) for the cdf from the Cramér–von Mises and Anderson–Darling statistics, because those criteria are based on all of the deviations rather than just the largest one.
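To see why: the KS event inverts pointwise,

$$ \{ D_n \le c \} \;=\; \{\, F_n(x) - c \le F(x) \le F_n(x) + c \ \text{ for all } x \,\}, $$

so a critical value for $D_n$ translates directly into a simultaneous band for $F$. A bound on $\omega^2$ or $A^2$, by contrast, only constrains a weighted average of the squared deviations $[F_n(x) - F(x)]^2$, which doesn't translate into pointwise limits on $F(x)$.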
