Solved – Anderson-Darling code test

anderson darling test, normal distribution

I'm really rusty at statistics and I'm trying to write some C# code where I feed in a list of numbers and it tells me whether or not the numbers are normally distributed. I generated 50 numbers from the following site with a mean of 0 and a variance of 1.

http://www.random.org/gaussian-distributions/?mode=advanced

The algorithm I'm trying to use is the Anderson-Darling test (http://en.wikipedia.org/wiki/Anderson%E2%80%93Darling_test). I implemented

$$A^2 = -N - \frac{1}{N} \sum_{i=1}^{N} (2i - 1)\left[\ln \Phi(Y_i) + \ln\left(1 - \Phi(Y_{N+1-i})\right)\right]$$

(It's about half-way down the page, the case where the mean and variance are both known.)

The Phi function comes from http://www.johndcook.com/csharp_phi.html
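As a cross-check for the formula above, here is a minimal Python sketch (not the C# code from the question). It assumes the mean and standard deviation are known rather than estimated, and it uses `math.erfc` for the normal CDF instead of the linked Phi implementation. The names `phi`, `survival` and `anderson_darling_known` are illustrative only. One thing worth noting: $A^2$ is a weighted sum of squared differences between the empirical and hypothesized CDFs, so it cannot be negative. A result such as -3.05 therefore suggests a bug somewhere (sorting, indexing, or integer division in `1 / N` are common suspects).

```python
import math

def phi(x):
    """Standard normal CDF, written with erfc for better tail accuracy."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def survival(x):
    """1 - Phi(x), computed directly to avoid cancellation in the upper tail."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def anderson_darling_known(data, mean=0.0, std=1.0):
    """A^2 against a fully specified normal (mean and std known, not estimated)."""
    if not data:
        raise ValueError("data must contain at least one value")
    # Standardize, then sort ascending: the formula requires ordered values.
    y = sorted((x - mean) / std for x in data)
    n = len(y)
    total = 0.0
    for i in range(1, n + 1):
        # Y[i] is y[i - 1] and Y[N + 1 - i] is y[n - i] with 0-based indexing.
        total += (2 * i - 1) * (math.log(phi(y[i - 1])) + math.log(survival(y[n - i])))
    return -n - total / n
```

For the 50 values in the question, the call would be `anderson_darling_known(values, mean=0.0, std=1.0)`, and the result should be a small non-negative number if the sample really is standard normal.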

When I run the code I wrote on an actual normal distribution, I get a value of -3.05 back.

Is the next step to look this number up in a table of normal distribution critical values to get the associated probability? -3.05 maps to 0.0011. Does this mean that my data has a 0.11% chance of coming from a normal distribution (assuming my code is correct)?

Best Answer

The next step is to compare your value to a critical value for the test-statistic. From the same Wikipedia page:

"If $A^{*2}$ exceeds a given critical value, then the hypothesis of normality is rejected with some significance level."

That is, your null hypothesis is that the data were generated from a normal distribution, and a value of $A^{*2}$ above the critical value means you reject normality at that significance level. However, the $A^{*2}$ statistic computed from normal data is not itself normally distributed, so you cannot look it up in a table of normal critical values. Per this resource:

"The Anderson-Darling test makes use of the specific distribution in calculating critical values. This has the advantage of allowing a more sensitive test and the disadvantage that critical values must be calculated for each distribution."

The same source points to books and papers for these critical values. You may also be able to find the CDF of the $A^2$ statistic for your case and use it to compute a p-value.
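To make the comparison step concrete, here is a minimal Python sketch. It assumes the fully specified case from the question (mean and variance known) and uses upper-tail critical values commonly tabulated for that case (1.933 at 10%, 2.492 at 5%, 3.070 at 2.5%). Treat these numbers as an assumption and check them against a published table before relying on them. They do not apply when the parameters are estimated from the data. The names `CRITICAL_VALUES_KNOWN` and `rejects_normality` are hypothetical.

```python
# Assumed critical values for A^2 when mean and variance are both known.
# Verify against a published table; they differ if parameters are estimated.
CRITICAL_VALUES_KNOWN = {0.10: 1.933, 0.05: 2.492, 0.025: 3.070}

def rejects_normality(a2, alpha=0.05, table=CRITICAL_VALUES_KNOWN):
    """Return True if A^2 exceeds the critical value at significance level alpha."""
    if a2 < 0:
        # A^2 is non-negative by construction, so this indicates a coding error.
        raise ValueError("A^2 cannot be negative; check the implementation")
    if alpha not in table:
        raise KeyError("no critical value tabulated for alpha=%r" % alpha)
    return a2 > table[alpha]
```

A return value of `False` only means the test found no evidence against normality at that level. It is not a probability that the data are normal.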
