I am fitting my experimental data with different distributions and computing the Anderson-Darling statistic for my data against each theoretical distribution. I want to compute the p-value from the Anderson-Darling statistic without using the tables. How can I compute the p-value?
Solved – Can we compute P-value of Anderson Darling Test with AD statistic without using the tables given
anderson-darling-test, p-value
Related Solutions
The next step is to compare your value to a critical value for the test-statistic. From the same Wikipedia page:
If $A^{*2}$ exceeds a given critical value, then the hypothesis of normality is rejected with some significance level.
That is, your null hypothesis is that the data are generated from a normal distribution, and an $A^{*2}$ exceeding the critical value leads you to reject normality at that significance level. However, the $A^{*2}$ statistic for a normal distribution is not itself normally distributed, per this resource:
The Anderson-Darling test makes use of the specific distribution in calculating critical values. This has the advantage of allowing a more sensitive test and the disadvantage that critical values must be calculated for each distribution.
The same source points to books and papers for these critical values. If you can find the CDF of the $A^2$ statistic for each distribution, you can compute the p-value from it directly.
For a fully specified distribution, the Anderson-Darling - as with the Kolmogorov-Smirnov, the Cramer-von Mises, the Kuiper test and many other ecdf-based tests - is distribution-free.
So you don't need tables for the 'standard t' such as that represented by the cdf function `pt`. All you need do is apply `pt` to your data (data$^\dagger$) and test that for uniformity ... which is effectively what these tests all do, and that's how `goftest::ad.test` works: it uses fully specified distributions.
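As a rough sketch of that idea in base R: transform the data with the fully specified cdf, compute the Anderson-Darling statistic against the uniform distribution, and obtain a p-value by simulating the statistic from uniform samples (valid here because the test is distribution-free). The helper name `ad_stat_unif`, the simulated data, the choice of `df = 5` and the number of replicates are all illustrative assumptions, not part of any package.

```r
# Anderson-Darling statistic for values u that should be Uniform(0, 1)
# under the null hypothesis (hypothetical helper, base R only).
ad_stat_unif <- function(u) {
  u <- sort(u)
  n <- length(u)
  i <- seq_len(n)
  # A^2 = -n - (1/n) * sum((2i - 1) * [log u_(i) + log(1 - u_(n+1-i))])
  -n - mean((2 * i - 1) * (log(u) + log1p(-rev(u))))
}

set.seed(1)
x <- rt(50, df = 5)   # illustrative data; replace with your own sample
u <- pt(x, df = 5)    # probability integral transform with the specified cdf
a2 <- ad_stat_unif(u)

# Monte Carlo p-value: simulate the null distribution of the statistic
# from uniform samples of the same size.
sims <- replicate(5000, ad_stat_unif(runif(length(x))))
p_value <- (1 + sum(sims >= a2)) / (1 + length(sims))
p_value
```

If the goftest package is installed, `goftest::ad.test(x, null = "pt", df = 5)` should report the same statistic; its p-value comes from a different approximation, so expect small differences from the simulated one.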
The asymptotic distribution of the Anderson-Darling test statistic for completely specified distributions was worked out by Anderson and Darling (1952, 1954) -- the 1952 paper gives the theory of computation of the asymptotic distribution (of a large class of tests of the Cramer-von Mises type, including specific discussion of what would become known as the Anderson-Darling test) and the 1954 paper gives asymptotic 10%, 5% and 1% critical values for the Anderson-Darling statistic.
In that paper they say that convergence to the asymptotic distribution is rapid and suggest it should be okay by $n=40$. Stephens (1974) suggests using it for $n\ge 5$.
Peter Lewis (1961) did tabulations of the distribution for $n\le 8$.
A little testing of `goftest::ad.test` suggests that the code isn't using the asymptotic distribution down at $n=10$, however (e.g. simulation at $n=10$ shows that the 5% CV there is around 2.512, which is larger than the asymptotic value). So if all else fails, let's read the help.
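A simulation of this kind can be sketched in base R as follows. This is only an illustration of the approach, not the code used for the figure quoted above; the helper `ad_stat_unif`, the seed and the number of replicates are assumptions, and the estimate will vary a little from run to run.

```r
# Anderson-Darling statistic for values that are Uniform(0, 1) under the null
# (hypothetical helper, base R only).
ad_stat_unif <- function(u) {
  u <- sort(u)
  n <- length(u)
  i <- seq_len(n)
  -n - mean((2 * i - 1) * (log(u) + log1p(-rev(u))))
}

set.seed(123)
n <- 10
# Null distribution of the statistic at n = 10 for a fully specified null
sims <- replicate(20000, ad_stat_unif(runif(n)))

# Simulated 5% critical value: the 95th percentile of the null distribution
cv_5pct <- unname(quantile(sims, 0.95))
cv_5pct
```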
The help for `ad.test` refers to Marsaglia and Marsaglia (2004). They use simulation for $n=2^k$ for $k=3,4,5,6,7$ to identify a simple 3-piece transformation of the small-sample test statistic (scaled by $1/n$) so that the asymptotic distribution can be used for small $n$. So it seems their code takes the statistic, scales it a little bit according to their piecewise transformation, and then compares that to the asymptotic distribution.
Anderson, T. W. and Darling, D. A. (1952), "Asymptotic theory of certain 'goodness-of-fit' criteria based on stochastic processes," Annals of Mathematical Statistics, 23: 193–212.
Anderson, T. W. and Darling, D. A. (1954), "A Test of Goodness-of-Fit," Journal of the American Statistical Association, 49: 765–769.
Stephens, M. A. (1974), "EDF Statistics for Goodness of Fit and Some Comparisons," Journal of the American Statistical Association, 69 (347): 730–737.
Lewis, P. A. (1961), "Distribution of the Anderson-Darling Statistic," Annals of Mathematical Statistics, 32: 1118–1124.
Marsaglia, G. and Marsaglia, J. (2004), "Evaluating the Anderson-Darling Distribution," Journal of Statistical Software, 9 (2): 1–5. http://www.jstatsoft.org/v09/i02
$\dagger$ About which:
install.packages("fortunes")
library(fortunes)
fortune(77)
Best Answer
If you don't want to interpolate from pre-computed tables, you can use bootstrap-based simulation instead. Try `adSim` from the qualityTools package (https://cran.r-project.org/package=qualityTools).
Let's use the normal distribution in our example. You'll need to change the string to something else if you don't want to test for normality.
Interpolation
The tabulated critical values are the 75%, 90%, 95%, 97.5% and 99% percentiles (available in the source code: https://github.com/cran/qualityTools/blob/master/R/adSim.R).
Bootstrap simulation
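A minimal base-R sketch of a parametric bootstrap for the normal case, with the mean and standard deviation estimated from the data, might look like this. It is not the `adSim` implementation; the function names `ad_stat_unif`, `ad_stat_normal` and `ad_boot_pvalue`, the example data and the default of 2000 replicates are all assumptions made for illustration.

```r
# Anderson-Darling statistic for values that are Uniform(0, 1) under the null
ad_stat_unif <- function(u) {
  u <- sort(u)
  n <- length(u)
  i <- seq_len(n)
  -n - mean((2 * i - 1) * (log(u) + log1p(-rev(u))))
}

# AD statistic for normality with mean and sd estimated from the sample
ad_stat_normal <- function(x) {
  ad_stat_unif(pnorm(x, mean = mean(x), sd = sd(x)))
}

# Parametric bootstrap p-value (hypothetical helper): draw samples from the
# fitted normal, re-estimate the parameters in every replicate, and compare
# the observed statistic with the simulated ones.
ad_boot_pvalue <- function(x, b = 2000) {
  n <- length(x)
  observed <- ad_stat_normal(x)
  sims <- replicate(b, ad_stat_normal(rnorm(n, mean = mean(x), sd = sd(x))))
  list(statistic = observed,
       p.value = (1 + sum(sims >= observed)) / (1 + b))
}

set.seed(42)
x <- rexp(40)   # illustrative, clearly non-normal data; use your own sample
res <- ad_boot_pvalue(x)
res
```

Re-estimating the parameters inside each replicate matters: once parameters are fitted to the data, the statistic no longer follows the fully specified null distribution, so the simulation has to mimic the fitting step. For another distribution, the same pattern should work with the corresponding cdf, random generator and parameter estimates swapped in.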