Solved – Required number of permutations for a permutation-based p-value

hypothesis-testing, p-value, permutation-test, resampling

If I need to calculate a permutation-based $p$-value with significance level $\alpha$, how many permutations do I need?

From the article "Permutation Tests for Studying Classifier Performance", page 5:

> In practice, the upper bound $1/(2\sqrt{k})$ is typically used to determine the number of samples required to achieve the desired precision of the test.

… where $k$ is the number of permutations.

How do I calculate the number of required permutations from this formula?

Best Answer

I admit the paragraph might be confusing.

When performing a permutation test you estimate a p-value. The issue is that this estimate itself has an error, given by $\sqrt{\frac{p(1-p)}{k}}$, where $p$ is the true p-value and $k$ the number of permutations. If the error is too large, the estimated p-value is unreliable.
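As a sketch of what is being estimated (the two samples here are made-up illustration data), this runs a one-sided permutation test for a difference in means over $k$ random permutations and reports the standard error $\sqrt{\hat p(1-\hat p)/k}$ of the estimated p-value:

```python
import math
import random

random.seed(0)
a = [2.1, 2.5, 2.3, 2.7, 2.9]   # hypothetical sample A
b = [1.8, 2.0, 1.9, 2.2, 2.1]   # hypothetical sample B

def mean(xs):
    return sum(xs) / len(xs)

observed = mean(a) - mean(b)    # observed test statistic
pooled = a + b
k = 10000                       # number of permutations
count = 0
for _ in range(k):
    random.shuffle(pooled)
    # test statistic under a random relabeling of the pooled data
    perm_stat = mean(pooled[:len(a)]) - mean(pooled[len(a):])
    if perm_stat >= observed:
        count += 1

p_hat = count / k                                # estimated p-value
se = math.sqrt(p_hat * (1 - p_hat) / k)          # its standard error
print(p_hat, se)
```

Note that `se` can never exceed $1/(2\sqrt{k})$, which is exactly the upper bound the cited paper uses.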

So how many permutations $k$ does one need to get a reliable estimate?

First, define your maximum allowed error, a.k.a. the precision; call it $P$. Then the estimated p-value shall lie in the interval $[p-3P, p+3P]$ (since the estimate is approximately normally distributed around the true p-value $p$).

Using the upper bound

The cited paragraph of the paper suggests using $\frac{1}{2\sqrt{k}}$ as an upper-bound estimate of the error instead of $\sqrt{\frac{p(1-p)}{k}}$. This corresponds to an unknown p-value of $p=0.5$, where the error is maximal over all $p$ for a fixed $k$.

So: You want to know k where $\frac{1}{2\sqrt{k}}\le P$.

<=> $\frac{1}{4P^2}\le k$.
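As a small calculator sketch (the function name is made up), this evaluates $k \ge \frac{1}{4P^2}$; exact rational arithmetic via `Fraction` avoids float rounding right at the boundary:

```python
import math
from fractions import Fraction

def k_from_upper_bound(P):
    """Smallest k satisfying 1/(2*sqrt(k)) <= P, i.e. k >= 1/(4*P^2)."""
    P = Fraction(str(P))            # exact rational, e.g. 0.005 -> 1/200
    return math.ceil(1 / (4 * P**2))

print(k_from_upper_bound(0.005))    # 10000
```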

But since the cited formula is only an upper bound, this approach is quite rough.

Using the error at the significance level

Another approach uses the desired significance level $\alpha$ as $p$ to calculate the required precision. This makes sense because the error of the estimated p-value matters most when we are near the decision threshold (which is the significance level).

In this case one wants to know k where $\sqrt{\frac{\alpha(1-\alpha)}{k}}\le P$.

<=> $\frac{\alpha(1-\alpha)}{P^2}\le k$.
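The same kind of sketch for this variant (again with a made-up function name), evaluating $k \ge \frac{\alpha(1-\alpha)}{P^2}$ exactly:

```python
import math
from fractions import Fraction

def k_at_alpha(P, alpha):
    """Smallest k satisfying sqrt(alpha*(1-alpha)/k) <= P."""
    P, alpha = Fraction(str(P)), Fraction(str(alpha))
    return math.ceil(alpha * (1 - alpha) / P**2)

print(k_at_alpha(0.005, 0.05))   # 1900
```

For $P=0.005$ and $\alpha=0.05$ this gives $k \ge 1900$, noticeably less than the $10000$ demanded by the rough upper bound.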

Note that if the true unknown p-value is clearly bigger than $\alpha$, then the error is actually bigger, so the estimate lying in $[p-3P, p+3P]$ no longer holds.

Extending the confidence interval

This approach corresponds to the center of the confidence interval lying right at the decision threshold. In order to force the upper bound of the confidence interval of the estimated p-value below the decision threshold (which is more correct), one needs …

$l\sqrt{\frac{\alpha(1-\alpha)}{k}}\le P$

<=> $l^2\frac{\alpha(1-\alpha)}{P^2}\le k$

where $l$ sets the confidence level:

| $l$ | confidence interval |
|---|---|
| 1 | ~68 % |
| 2 | ~95 % |
| 3 | ~99 % |
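Putting the pieces together (a sketch; `k_required` is a name I made up), the full formula $k \ge l^2\frac{\alpha(1-\alpha)}{P^2}$ can be computed exactly:

```python
import math
from fractions import Fraction

def k_required(P, alpha, l=2):
    """Smallest k satisfying l*sqrt(alpha*(1-alpha)/k) <= P.

    l=1 -> ~68 % confidence, l=2 -> ~95 %, l=3 -> ~99 %.
    Exact rational arithmetic avoids float rounding at the boundary.
    """
    P, alpha = Fraction(str(P)), Fraction(str(alpha))
    return math.ceil(l**2 * alpha * (1 - alpha) / P**2)

print(k_required(0.005, alpha=0.05, l=2))   # 7600
print(k_required(0.01, alpha=0.01, l=2))    # 396
```

These two calls reproduce the worked examples that follow.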

Examples: Let the desired precision $P$ be 0.005.

Then the rough upper bound gives $k \ge 10000$.

Using the error at $\alpha=0.05$ and requesting a 95 % confidence interval ($l=2$) gives $k \ge 7600$.

For $P=0.01$ at $\alpha=0.01$ and a 95 % confidence interval, one gets $k \ge 396$.

Finally: I strongly suggest diving deeper into Monte Carlo simulation. Wikipedia provides a start.
