The question is quite vague, so I am going to assume you want to choose an appropriate performance measure to compare different models. For a good overview of the key differences between ROC and PR curves, you can refer to the following paper: The Relationship Between Precision-Recall and ROC Curves by Davis and Goadrich.
To quote Davis and Goadrich:
> However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an algorithm's performance.
ROC curves plot TPR versus FPR. To be more explicit:
$$FPR = \frac{FP}{FP+TN}, \quad TPR=\frac{TP}{TP+FN}.$$
PR curves plot precision versus recall (TPR), or more explicitly:
$$recall = \frac{TP}{TP+FN} = TPR,\quad precision = \frac{TP}{TP+FP}$$
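To make these formulas concrete, here is a minimal Python sketch with made-up confusion-matrix counts (the numbers are purely illustrative):

```python
# Hypothetical confusion-matrix counts, purely for illustration.
TP, FP, TN, FN = 80, 40, 860, 20

tpr = TP / (TP + FN)        # recall / TPR: 80 / 100 = 0.80
fpr = FP / (FP + TN)        # fall-out / FPR: 40 / 900 ≈ 0.044
precision = TP / (TP + FP)  # 80 / 120 ≈ 0.67

print(f"TPR = {tpr:.3f}, FPR = {fpr:.3f}, precision = {precision:.3f}")
```

If the negatives were ten times as frequent (so FP and TN both grew tenfold), FPR would stay at 0.044 while precision would drop to 80 / 480 ≈ 0.17, which is exactly the imbalance effect discussed next.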
Precision is directly influenced by class (im)balance, since the number of false positives scales with the number of negatives, whereas TPR depends only on the positives. This is why ROC curves do not capture such effects.
Precision-recall curves are better to highlight differences between models for highly imbalanced data sets. If you want to compare different models in imbalanced settings, area under the PR curve will likely exhibit larger differences than area under the ROC curve.
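As a rough sketch of how you could check this on your own data (assuming scikit-learn is available; the synthetic dataset and the choice of models here are just for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, highly imbalanced binary problem (~2% positives).
X, y = make_classification(n_samples=20000, n_features=20,
                           weights=[0.98, 0.02], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(n_estimators=100, random_state=0)):
    scores = model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    print(type(model).__name__,
          "AUROC = %.3f" % roc_auc_score(y_te, scores),
          "AUPR = %.3f" % average_precision_score(y_te, scores))
```

On skewed data like this, the gap in AUPR between the two models will typically be much wider than the gap in AUROC.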
That said, ROC curves are much more common (even if they are less suited). Depending on your audience, ROC curves may be the lingua franca, so using those is probably the safer choice. If one model completely dominates another in PR space (i.e. it always has higher precision over the entire recall range), it will also dominate in ROC space. If the curves cross in either space, they will also cross in the other. In other words, the main conclusions will be similar no matter which curve you use.
Shameless advertisement: as an additional example, you could have a look at one of my papers, in which I report both ROC and PR curves in an imbalanced setting. Figure 3 contains ROC and PR curves for identical models, clearly showing the difference between the two. To compare area under the PR curve versus area under the ROC curve, compare tables 1-2 (AUPR) with tables 3-4 (AUROC): AUPR shows much larger differences between individual models than AUROC, which emphasizes the suitability of PR curves once more.
First, let's define the area under the ROC curve formally. Some assumptions and definitions:
- We have a probabilistic classifier that outputs a "score" $s(x)$, where $x$ are the features and $s$ is a generic monotonically increasing function of the estimated probability $p(\text{class} = 1 \mid x)$.
- $f_{k}(s)$, with $k \in \{0, 1\}$, is the pdf of the scores for class $k$, with CDF $F_{k}(s)$.
- The classification of a new observation is obtained by comparing the score $s$ to a threshold $t$.

Furthermore, for mathematical convenience, let's consider the positive class (event detected) to be $k = 0$ and the negative class to be $k = 1$, so that an observation is classified as positive when $s \leq t$. In this setting we can define:
- Recall (aka Sensitivity, aka TPR): $F_{0}(t)$ (proportion of positive cases classified as positive)
- Specificity (aka TNR): $1 - F_{1}(t)$ (proportion of negative cases classified as negative)
- FPR (aka Fall-out): 1 - TNR = $F_{1}(t)$
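To make these definitions concrete, here is a small numpy sketch (the Gaussian score distributions are an arbitrary assumption for illustration; note the convention above that the positive class $k = 0$ gets the lower scores):

```python
import numpy as np

rng = np.random.default_rng(0)
s0 = np.sort(rng.normal(0.0, 1.0, 100_000))  # scores of class 0 (positive)
s1 = np.sort(rng.normal(1.0, 1.0, 100_000))  # scores of class 1 (negative)

# Empirical CDFs F_0(t) and F_1(t) on a grid of thresholds t.
ts = np.linspace(-4.0, 5.0, 500)
F0 = np.searchsorted(s0, ts) / s0.size  # TPR at each threshold
F1 = np.searchsorted(s1, ts) / s1.size  # FPR at each threshold

# The ROC curve is the parametric plot of (F1(t), F0(t)) as t varies;
# its area via the trapezoidal rule:
auc = np.sum(np.diff(F1) * (F0[1:] + F0[:-1]) / 2)
print(f"empirical AUC ≈ {auc:.3f}")  # ≈ 0.76 for these two Gaussians
```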
The ROC curve is then a plot of $F_{0}(t)$ against $F_{1}(t)$. Setting $v = F_1(s)$, we can formally define the area under the ROC curve as:
$$AUC =\int_{0}^{1} F_{0}(F_{1}^{-1}(v)) dv$$
Changing variable ($dv = f_{1}(s)ds$):
$$AUC =\int_{ - \infty}^{\infty} F_{0}(s) f_{1}(s)ds$$
This formula can easily be seen to be the probability that a randomly drawn member of class 0 produces a score lower than the score of a randomly drawn member of class 1.
This proof is taken from:
https://pdfs.semanticscholar.org/1fcb/f15898db36990f651c1e5cdc0b405855de2c.pdf
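As a quick numerical sanity check of this identity, here is a sketch assuming scipy is available (the Gaussian score densities are chosen arbitrarily):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Arbitrary score distributions; class 0 (positive) scores lower on average.
f0, f1 = stats.norm(0.0, 1.0), stats.norm(1.0, 1.0)

# AUC as the integral of F_0(s) f_1(s) over the real line.
auc_integral, _ = quad(lambda s: f0.cdf(s) * f1.pdf(s), -np.inf, np.inf)

# AUC as P(S_0 < S_1), estimated by Monte Carlo.
rng = np.random.default_rng(0)
auc_mc = np.mean(f0.rvs(200_000, random_state=rng)
                 < f1.rvs(200_000, random_state=rng))

print(f"integral: {auc_integral:.4f}, Monte Carlo: {auc_mc:.4f}")
# Both should be close to Phi(1/sqrt(2)) ≈ 0.7602 for these parameters.
```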
The area under the PR curve is ill-defined, because precision is not well defined at recall 0: when no instances are predicted positive, both TP and FP are zero, so precision involves a division by zero. You also cannot close this gap easily: the limiting precision may be anything from 0 to 1, depending on how well your retrieval works.
There is a common approximation to this: AveP, the average precision.
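In practice, a sketch of how AveP is typically computed (here via scikit-learn's `average_precision_score`, with made-up labels and scores):

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

# Made-up labels and scores, purely for illustration.
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 0, 1, 0])
y_scores = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.2, 0.3, 0.7, 0.5])

# AveP = sum over thresholds of (step in recall) * (precision there).
print("AveP =", average_precision_score(y_true, y_scores))

precision, recall, _ = precision_recall_curve(y_true, y_scores)
print("curve endpoint:", precision[-1], recall[-1])  # pinned at (1.0, 0.0)
```

Note that scikit-learn pins the PR curve's endpoint at recall = 0 with precision = 1 by convention, precisely because precision is undefined there.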