Nonparametric – Why is the Asymptotic Relative Efficiency of the Wilcoxon Test $3/\pi$ Compared to Student’s T-Test?

Tags: asymptotics, efficiency, nonparametric, wilcoxon-mann-whitney-test, wilcoxon-signed-rank

It is well-known that the asymptotic relative efficiency (ARE) of the Wilcoxon signed rank test is $\frac{3}{\pi} \approx 0.955$ compared to Student's t-test, if the data are drawn from a normally distributed population. This is true for both the basic one-sample test and the variant for two independent samples (the Wilcoxon-Mann-Whitney U). It is also the ARE of a Kruskal-Wallis test compared to an ANOVA F-test, for normal data.

Does this remarkable (for me, one of the "most unexpected appearances of $\pi$") and remarkably simple result have an insightful, remarkable or simple proof?

Best Answer

Brief sketch of the ARE of the one-sample $t$-test, sign test, and signed-rank test

I expect the long version of @Glen_b's answer will include a detailed analysis of the two-sample signed-rank test along with an intuitive explanation of the ARE, so I'll skip most of the derivation and sketch the one-sample case; the missing details can be found in Lehmann's Testing Statistical Hypotheses (TSH).

Testing Problem: Let $X_1,\ldots,X_n$ be a random sample from the location model $f(x-\theta)$, where $f$ is symmetric about zero. We want to compute the ARE of the sign test and the signed-rank test relative to the $t$-test for the hypothesis $H_0: \theta=0$.

To assess the relative efficiency of tests, only local alternatives are considered, because consistent tests have power tending to 1 against any fixed alternative. Local alternatives that give rise to nontrivial asymptotic power are often of the form $\theta_n=h/\sqrt{n}$ for fixed $h$; this is called Pitman drift in some of the literature.

Our task ahead is to:

  • find the limit distribution of each test statistic under the null
  • find the limit distribution of each test statistic under the alternative
  • compute the local asymptotic power of each test

Test statistics and asymptotics

  1. $t$-test (assuming the variance $\sigma^2$ exists) $$t_n=\sqrt{n}\frac{\bar{X}}{\hat{\sigma}}\to_dN(0,1)\quad \text{under the null}$$ $$t_n=\sqrt{n}\frac{\bar{X}}{\hat{\sigma}}\to_dN(h/\sigma,1)\quad \text{under the alternative }\theta=h/\sqrt{n}$$
    • so the test that rejects if $t_n>z_\alpha$ has asymptotic power function $$1-\Phi\left(z_\alpha-\frac{h}{\sigma}\right)$$
  2. sign test $S_n=\frac{1}{n}\sum_{i=1}^{n}1\{X_i>0\}$ $$\sqrt{n}\left(S_n-\frac{1}{2}\right)\to_dN\left(0,\frac{1}{4}\right)\quad \text{under the null }$$ $$\sqrt{n}\left(S_n-\frac{1}{2}\right)\to_dN\left(hf(0),\frac{1}{4}\right)\quad \text{under the alternative }$$ and has local asymptotic power $$1-\Phi\left(z_\alpha-2hf(0)\right)$$
  3. signed-rank test: by the Walsh-average identity $\sum_{i=1}^{n}R_i1\{X_i>0\}=\sum_{i\le j}1\{X_i+X_j>0\}$, the statistic is asymptotically equivalent to the U-statistic $W_n=\binom{n}{2}^{-1}\sum_{i<j}1\{X_i+X_j>0\}$, for which $$\sqrt{n}\left(W_n-\frac{1}{2}\right)\to_dN\left(0,\frac{1}{3}\right)\quad \text{under the null }$$ $$\sqrt{n}\left(W_n-\frac{1}{2}\right)\to_dN\left(2h\int f^2,\frac{1}{3}\right)\quad \text{under the alternative }$$ and has local asymptotic power $$1-\Phi\left(z_\alpha-\sqrt{12}h\int f^2\right)$$
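As a numerical sanity check on these local power formulas (my addition, not part of the original sketch), here is a minimal Monte Carlo simulation in Python; the sample size, drift, level, and replication count are arbitrary illustrative choices:

```python
# Minimal Monte Carlo sketch: compare the empirical power of the one-sample
# t-test and the Wilcoxon signed-rank test under the Pitman drift
# theta_n = h / sqrt(n) against the local asymptotic power formulas above.
# n, h, alpha and the number of replications are illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, h, alpha, reps = 400, 3.0, 0.05, 5_000
theta = h / np.sqrt(n)  # local alternative

reject_t = reject_w = 0
for _ in range(reps):
    x = rng.normal(loc=theta, scale=1.0, size=n)
    # one-sided tests of H0: theta = 0 against theta > 0
    reject_t += stats.ttest_1samp(x, 0.0, alternative="greater").pvalue < alpha
    reject_w += stats.wilcoxon(x, alternative="greater").pvalue < alpha

z = stats.norm.ppf(1 - alpha)
# slopes for standard normal f: 1/sigma = 1 for the t-test, and
# sqrt(12) * int f^2 = sqrt(12) / (2 sqrt(pi)) = sqrt(3/pi) for the signed rank
print(f"t-test:      empirical {reject_t / reps:.3f}, "
      f"theory {1 - stats.norm.cdf(z - h):.3f}")
print(f"signed rank: empirical {reject_w / reps:.3f}, "
      f"theory {1 - stats.norm.cdf(z - np.sqrt(3 / np.pi) * h):.3f}")
```

Both empirical rejection rates should land near their theoretical targets, with the Wilcoxon power only slightly below the $t$-test's, reflecting the $3/\pi$ efficiency.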

Therefore, $$ARE(S_n)=(2f(0)\sigma)^2$$ $$ARE(W_n)=\left(\sqrt{12}\,\sigma\int f^2\right)^2$$ Each ARE is the squared ratio of that test's local power slope to the $t$-test's slope $1/\sigma$: since the drift enters through $\sqrt{n}$, matching local powers requires the ratio of sample sizes to equal the squared ratio of slopes. If $f$ is the standard normal density, $ARE(S_n)=2/\pi$ and $ARE(W_n)=3/\pi$.
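Spelling out the normal case, to show where the $\pi$ enters: with $f=\varphi$ the standard normal density and $\sigma=1$, $$\int_{-\infty}^{\infty}\varphi(x)^2\,dx=\int_{-\infty}^{\infty}\frac{1}{2\pi}e^{-x^2}\,dx=\frac{1}{2\sqrt{\pi}},\qquad \varphi(0)=\frac{1}{\sqrt{2\pi}},$$ so $$ARE(S_n)=\left(\frac{2}{\sqrt{2\pi}}\right)^2=\frac{2}{\pi},\qquad ARE(W_n)=12\left(\frac{1}{2\sqrt{\pi}}\right)^2=\frac{3}{\pi}.$$ The $\pi$ comes in simply through the Gaussian integral $\int\varphi^2$.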

If $f$ is uniform on $[-1,1]$, $ARE(S_n)=1/3$ and $ARE(W_n)=1$.
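Spelling this case out as well: the uniform density on $[-1,1]$ has $f(0)=1/2$, $\int f^2=1/2$ and $\sigma^2=1/3$, so $$ARE(S_n)=\left(2\cdot\frac{1}{2}\cdot\frac{1}{\sqrt{3}}\right)^2=\frac{1}{3},\qquad ARE(W_n)=12\cdot\frac{1}{3}\cdot\left(\frac{1}{2}\right)^2=1.$$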

Remark on the derivation of the limiting distribution under the alternative

There are of course many ways to derive the limiting distribution under the alternative. One general approach is to use Le Cam's third lemma. A simplified version of it states:

Let $\Delta_n$ be the log of the likelihood ratio. For some statistic $W_n$, if $$ (W_n,\Delta_n)\to_d N\left[\left(\begin{array}{c} \mu\\ -\sigma^2/2 \end{array}\right),\left(\begin{array}{cc} \sigma^2_W & \tau \\ \tau & \sigma^2 \end{array}\right)\right] $$ under the null, then $$W_n\to_d N\left(\mu+\tau,\sigma^2_W\right)\quad\text{under the alternative}$$

For quadratic mean differentiable densities, local asymptotic normality and contiguity hold automatically, which in turn makes Le Cam's third lemma applicable. Using this lemma, we only need to compute $\mathrm{cov}(W_n,\Delta_n)$ under the null. $\Delta_n$ obeys the LAN expansion $$\Delta_n\approx \frac{h}{\sqrt{n}}\sum_{i=1}^{n}l(X_i)-\frac{1}{2}h^2I_0$$ where $l=-f'/f$ is the score function and $I_0$ the Fisher information. Then, for instance, for the sign test $S_n$, $$\mathrm{cov}(\sqrt{n}(S_n-1/2),\Delta_n)=-h\,\mathrm{cov}\left(1\{X_i>0\},\frac{f'}{f}(X_i)\right)=-h\int_0^\infty f'=hf(0)$$
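For completeness (this step is not in the original sketch), the same computation for the signed-rank statistic goes through its Hájek projection. By symmetry $1-F(-x)=F(x)$, so under the null $$\sqrt{n}\left(W_n-\frac{1}{2}\right)\approx\frac{2}{\sqrt{n}}\sum_{i=1}^{n}\left(F(X_i)-\frac{1}{2}\right),$$ hence $$\mathrm{cov}\left(\sqrt{n}(W_n-1/2),\Delta_n\right)=-2h\,\mathrm{cov}\left(F(X_i),\frac{f'}{f}(X_i)\right)=-2h\int Ff'=2h\int f^2,$$ where the last step is integration by parts: $\int Ff'=\left[Ff\right]_{-\infty}^{\infty}-\int f^2=-\int f^2$. This is exactly the drift $2h\int f^2$ quoted above.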