Solved – Permutation-based p-value to evaluate a classifier

classification, machine learning, p-value, permutation-test

I read an article entitled "Permutation Tests for Studying Classifier Performance", which explains how to use a permutation-based p-value to test the performance of a classifier.
I cannot work out how the p-value is calculated. The article gives this formula:

$p = \frac{|\{D' \in \hat{D} : e(f,D') \le e(f,D)\}| + 1}{k+1}$ (Definition 1)

Here $e(f,D)$ is some error measure of the classifier f on the dataset D, and D' is a dataset obtained from D by permutation.

This is the article:
http://jmlr.org/papers/volume11/ojala10a/ojala10a.pdf

But I do not understand the formula. Why would the error on D' be less than or equal to the error on D? How can one take the absolute value of an expression containing "≤"? And what is k?

The article says:

The empirical p-value of Definition 1 represents the fraction of randomized samples where the classifier behaved better in the random data than in the original data. Intuitively, it measures how likely the observed accuracy would be obtained by chance, only because the classifier identified in the training phase a pattern that happened to be random. Therefore, if the p-value is small enough usually under a certain threshold, for example, α = 0.05 we can say that the value of the error in the original data is indeed significantly small and in consequence, that the classifier is significant under the given null hypothesis, that is, the null hypothesis is rejected.

Best Answer

The formula uses standard set notation.

So $\{D' \in \hat{D}: e(f,D') \le e(f,D)\}$ is read as: the set of all permutations $D'$ of $D$ in $\hat{D}$, restricted to those where $e(f,D') \le e(f,D)$.

The pipes around the set definition ("$|$") denote the cardinality (size) of the set, not an absolute value.

The text above the formula in the article defines k: $\hat{D}$ is the set of all created permutations and k is the size (cardinality) of this set, so $k=|\hat{D}|$.

Example: Suppose you create 4 permutations $D'_1,D'_2,D'_3,D'_4$ of a dataset $D$ and calculate $e$ for every permutation. Let

$\hat{D}=\{D'_1,D'_2,D'_3,D'_4\}$
$e(f,D)=0.1$
$e(f,D'_1)=e(f,D'_2)=e(f,D'_3)=0.2$
$e(f,D'_4)=0.05$

Only $D'_4$ has an error no larger than $e(f,D)=0.1$, so the set has size 1, and $k=4$:

$p = \frac{1+1}{4+1} = 0.4$
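The counting in Definition 1 can be written in a few lines of Python. This is only a minimal sketch: the function name `empirical_p_value` is made up for illustration, and it assumes the errors on the original and permuted datasets have already been computed.

```python
def empirical_p_value(observed_error, permuted_errors):
    """Empirical p-value of Definition 1.

    observed_error  -- e(f, D), the error on the original dataset
    permuted_errors -- the errors e(f, D') for every D' in D-hat
    """
    k = len(permuted_errors)  # k = |D-hat|, the number of permutations
    # Cardinality of {D' in D-hat : e(f, D') <= e(f, D)}
    at_least_as_good = sum(1 for e in permuted_errors if e <= observed_error)
    return (at_least_as_good + 1) / (k + 1)


# The example above: three permutations with error 0.2, one with 0.05.
p = empirical_p_value(0.1, [0.2, 0.2, 0.2, 0.05])
print(p)  # 0.4
```

The "+1" in numerator and denominator means the estimate can never be exactly zero, however many permutations are drawn.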

Related Solutions

Solved – Required number of permutations for a permutation-based p-value

I admit, the paragraph might be confusing.

When performing a permutation test you estimate a p-value. The issue is that this estimate has an error of its own (its standard error), which is calculated as $\sqrt{\frac{p(1-p)}{k}}$. If the error is too large, the estimated p-value is unreliable.

So how many permutations k does one need to get a reliable estimate?

First define your maximum allowed error, i.e. the precision. Let this be $P$. Then the estimated p-value should lie in the interval $[p-3P,p+3P]$ (since the estimate of p is approximately normally distributed).

Using the upper bound

The cited paragraph of the paper suggests using $\frac{1}{2\sqrt{k}}$ as an upper bound estimate of the error instead of $\sqrt{\frac{p(1-p)}{k}}$. This corresponds to an unknown p-value of p=0.5 (where the error is largest among all p for a fixed k).

So you want to know k such that $\frac{1}{2\sqrt{k}}\le P$.

$\Leftrightarrow \frac{1}{4P^2}\le k$

But since the cited formula is only an upper bound, this approach is very rough and tends to ask for more permutations than necessary.
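To see how the exact error and its upper bound compare, here is a small sketch in Python. The function names are hypothetical, and the values of p and k are chosen purely for illustration.

```python
import math


def p_value_standard_error(p, k):
    """Standard error sqrt(p * (1 - p) / k) of a p-value estimated from k permutations."""
    return math.sqrt(p * (1 - p) / k)


def worst_case_standard_error(k):
    """Upper bound 1 / (2 * sqrt(k)), i.e. the standard error at p = 0.5."""
    return 1 / (2 * math.sqrt(k))


k = 10000  # illustrative number of permutations
for p in (0.01, 0.05, 0.5):
    # The bound is only attained at p = 0.5; for small p the true error is smaller.
    print(p, p_value_standard_error(p, k), worst_case_standard_error(k))
```

For p-values near typical significance levels the actual error is well below the bound, which is why the bound leads to a conservative choice of k.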

Using the error at the significance level

Another approach uses the desired significance level $\alpha$ as p to calculate the required precision. This is reasonable, because the error of the estimated p matters most when we are near the decision threshold (which is the significance level).

In this case one wants to know k such that $\sqrt{\frac{\alpha(1-\alpha)}{k}}\le P$.

$\Leftrightarrow \frac{\alpha(1-\alpha)}{P^2}\le k$

Note that if the true unknown p-value is clearly bigger than $\alpha$, then the error is actually bigger, so the interval $[p-3P,p+3P]$ no longer holds.

Extending the confidence interval

The previous approach corresponds to the center of the confidence interval sitting right at the decision threshold. To force the upper bound of the confidence interval of the estimated p below the decision threshold (which is more correct), one needs ...

$l\sqrt{\frac{\alpha(1-\alpha)}{k}}\le P$

$\Leftrightarrow l^2\frac{\alpha(1-\alpha)}{P^2}\le k$

where l is the number of standard errors covered by the interval, corresponding to the following confidence levels:

| l | confidence interval |
|---|---------------------|
| 1 | ~68 % |
| 2 | ~95 % |
| 3 | ~99.7 % |

Examples: Let the desired precision P be 0.005.

Then using the rough upper bound one gets $k \ge 10000$.

Using P at $\alpha=0.05$ and requesting a 95 % confidence interval (l=2) one gets $k \ge 7600$.

For P=0.01 at $\alpha=0.01$ and a 95 % confidence interval (l=2) one gets $k \ge 396$.
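These numbers can be reproduced with a short Python sketch of the two formulas for k. The function names are made up for illustration, and exact fractions are used (an implementation choice, not part of the formulas) so that rounding up is not thrown off by floating-point error.

```python
import math
from fractions import Fraction


def _exact(x):
    """Turn a decimal literal such as 0.005 into an exact fraction."""
    return Fraction(str(x))


def k_from_upper_bound(precision):
    """Smallest integer k with 1 / (2 * sqrt(k)) <= precision."""
    P = _exact(precision)
    return math.ceil(1 / (4 * P**2))


def k_at_significance_level(alpha, precision, l=1):
    """Smallest integer k with l * sqrt(alpha * (1 - alpha) / k) <= precision."""
    a, P = _exact(alpha), _exact(precision)
    return math.ceil(l**2 * a * (1 - a) / P**2)


print(k_from_upper_bound(0.005))                  # 10000
print(k_at_significance_level(0.05, 0.005, l=2))  # 7600
print(k_at_significance_level(0.01, 0.01, l=2))   # 396
```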

Finally: I strongly suggest diving deeper into Monte Carlo simulations. Wikipedia provides a starting point.
