Significant p-value for Mann-Whitney U test but low AUC

auc, scipy, wilcoxon-mann-whitney-test

How is it possible that for two sample sets I'm getting a low p-value, but also a low AUC value (just below 0.5)?

To compute the p-value, I'm using the second value returned by the function documented here: http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.mannwhitneyu.html

For the AUC, I'm taking the first value returned by the same function (the U statistic) and dividing it by the product of the two sample sizes.

And here is a boxplot of the two series:

[Figure: boxplot of the two series]

Best Answer

Just looking at the boxplot strongly suggests that @Scortchi's comment is correct. The number of outliers alone indicates that your sample size is very large, so the test has very high power to detect even tiny differences. In other words, you have strong evidence of a very small degree of discrimination, which is usually of little practical interest.
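To illustrate the point, here is a minimal simulation sketch. The data are made up (two normal samples whose means differ by a hypothetical 0.05 standard deviations), and it assumes a recent SciPy (1.7 or later), where `mannwhitneyu` returns the U statistic for the first sample and accepts `alternative="two-sided"`. It is not the asker's data or exact procedure, just a way to see how the same tiny shift behaves at a small and a very large sample size.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def auc_and_p(rng, n, shift):
    """Draw two normal samples of size n and return (AUC, two-sided p-value)."""
    x = rng.normal(loc=shift, scale=1.0, size=n)  # first sample, shifted by `shift`
    y = rng.normal(loc=0.0, scale=1.0, size=n)    # second sample
    res = mannwhitneyu(x, y, alternative="two-sided")
    # Assumes SciPy >= 1.7, where the statistic is U for the first sample,
    # so U / (n1 * n2) estimates P(X > Y), i.e. the AUC.
    auc = res.statistic / (len(x) * len(y))
    return auc, res.pvalue


rng = np.random.default_rng(0)
shift = -0.05  # hypothetical, deliberately tiny difference in location

auc_small, p_small = auc_and_p(rng, 50, shift)
auc_large, p_large = auc_and_p(rng, 100_000, shift)

print(f"n = 50:      AUC = {auc_small:.3f}, p = {p_small:.3g}")
print(f"n = 100000:  AUC = {auc_large:.3f}, p = {p_large:.3g}")
```

With the large sample the AUC should land just below 0.5 while the p-value is extremely small, which is the situation described in the question. With only 50 observations per group, the same shift will usually not come out as significant.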

Mann-Whitney p-values (using the normal approximation) vs AUC for some different sample sizes ($n_1,n_2$):

[Figure: AUC vs p-value]
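The relationship in the plot can be roughly reproduced from the normal approximation. Under the null hypothesis, $U$ has mean $n_1 n_2 / 2$ and variance $n_1 n_2 (n_1 + n_2 + 1) / 12$, so with $\text{AUC} = U / (n_1 n_2)$ the z-score is $(\text{AUC} - 0.5)\sqrt{12 n_1 n_2 / (n_1 + n_2 + 1)}$. The sketch below assumes a two-sided test with no tie or continuity correction, which may not match exactly how the original figure was produced; the function name `approx_p_value` is just for illustration.

```python
import math


def approx_p_value(auc, n1, n2):
    """Two-sided Mann-Whitney p-value from the normal approximation.

    Assumes no ties and no continuity correction.
    """
    z = (auc - 0.5) * math.sqrt(12.0 * n1 * n2 / (n1 + n2 + 1))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Same AUC (0.49), increasing equal group sizes.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n1 = n2 = {n:>6}: p = {approx_p_value(0.49, n, n):.3g}")
```

For a fixed AUC, the z-score grows roughly with the square root of the sample size, so an AUC of 0.49 is nowhere near significant with a hundred observations per group but highly significant with a hundred thousand.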