The precision/recall curve for a KNN classifier effectively consists of just a couple of points (since KNN predicts hard binary labels), so such a curve is not very useful or meaningful. One could instead use the fraction of a given class in the neighborhood (i.e., a non-smoothed density estimate; requires K > 1) as the model's prediction, which would make the precision/recall curve more meaningful.
As for leveraging distance, one could try "distance-weighted voting" by relying on a distance-based kernel to get the weights (one of the simplest kernel density methods), but that would not really qualify as a KNN classifier, since KNN's kernel is just a step function.
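For what it's worth, here is a rough sketch of the vote-fraction idea in R, using `knn()` from the `class` package; the toy data and variable names are purely illustrative. With `prob = TRUE`, `knn()` reports the vote fraction for the winning class, which can be turned into a score for the positive class and thresholded to trace out a precision/recall curve:

```r
library(class)  # provides knn()

# Toy two-class data ("pos"/"neg"); purely illustrative
set.seed(1)
train_x <- matrix(rnorm(200), ncol = 2)
train_y <- factor(ifelse(train_x[, 1] > 0, "pos", "neg"))
test_x  <- matrix(rnorm(100), ncol = 2)
test_y  <- factor(ifelse(test_x[, 1] > 0, "pos", "neg"))

# K > 1 so the vote fraction is informative; prob = TRUE attaches the
# proportion of votes for the *winning* class to the result
pred     <- knn(train_x, test_x, train_y, k = 15, prob = TRUE)
win_frac <- attr(pred, "prob")

# Convert the winning-class fraction into a score for the "pos" class
score_pos <- ifelse(pred == "pos", win_frac, 1 - win_frac)

# Sweep a threshold over the score to trace out precision/recall pairs
thresholds <- sort(unique(score_pos))
pr_curve <- t(sapply(thresholds, function(thr) {
  predicted_pos <- score_pos >= thr
  tp <- sum(predicted_pos & test_y == "pos")
  c(precision = tp / sum(predicted_pos),
    recall    = tp / sum(test_y == "pos"))
}))
```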
The "Mean Average Precision" (sometimes abbreviated mAP or MAP) might be what you want. It's pretty commonly used for evaluating information retrieval systems and is fairly straightforward to compute.
First, calculate the average precision for a given query. To do this, rank the documents and compute the precision after retrieving each relevant document. For example, suppose that four documents are relevant to this query, and our system returned the following:
- Relevant document
- Irrelevant document
- Relevant document
- Relevant document
- Irrelevant document
- Irrelevant document
- Relevant document
The first relevant document is at position one, and the precision there is 1/1 = 1.0.
The next relevant document is at position three; two of the three documents seen so far are relevant, so the precision there is 2/3. Document four is also relevant, giving a precision of 3/4. The final relevant item is at position seven, giving a precision of 4/7.
Take the mean of these precision scores, $\frac{1}{4}\left(1 + \frac{2}{3} + \frac{3}{4} + \frac{4}{7}\right) \approx 0.747$, to get the average precision for this query. The mean average precision is then just the mean of these per-query averages across all the queries in your evaluation set.
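If it helps, here is a small R sketch of that calculation (the function names and the 0/1 relevance encoding are mine, just for illustration):

```r
# Average precision for one ranked result list:
# `relevance` is a 0/1 vector in ranked order, 1 = relevant
average_precision <- function(relevance) {
  hit_positions <- which(relevance == 1)
  # precision after each relevant document: (# relevant so far) / position
  precisions <- seq_along(hit_positions) / hit_positions
  mean(precisions)
}

# The worked example above: R, I, R, R, I, I, R
average_precision(c(1, 0, 1, 1, 0, 0, 1))  # ~0.747

# Mean Average Precision: mean of the per-query average precisions
mean_average_precision <- function(relevance_per_query) {
  mean(sapply(relevance_per_query, average_precision))
}
```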
As for choosing a precision-recall trade-off, that's largely up to you. The $F_1$ score gives them equal weight; you can interpret the $\beta$ in $F_\beta$ as giving $\beta$ times more weight to recall than to precision. I believe some studies indicate that users prefer precision to recall, but I would bet it depends a lot on the application and use case. I certainly don't need Google to show me every webpage about cats, but I do want all the sites on the first page to be relevant. On the flip side, it might be more important to return every possibly-relevant document if you're doing discovery for a court case.
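For reference, the usual definition, with $P$ for precision and $R$ for recall, is
$$F_\beta = (1 + \beta^2)\,\frac{P \cdot R}{\beta^2 P + R},$$
so $\beta > 1$ weights recall more heavily and $\beta < 1$ weights precision more heavily.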
Best Answer
I wrote a function for this purpose, based on the exercise in the book Data Mining with R:
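A minimal sketch along those lines (the name `prf_stats` and its interface are my own stand-ins, not the book's exact code):

```r
# Sketch of a precision/recall/F helper built around a confusion matrix.
# `predicted` and `actual` should be factors with the same levels.
prf_stats <- function(predicted, actual, beta = 1) {
  cm <- table(actual, predicted)            # rows = actual, columns = predicted
  classes <- rownames(cm)

  per_class <- t(sapply(classes, function(cls) {
    tp        <- cm[cls, cls]
    precision <- tp / sum(cm[, cls])        # out of everything predicted as cls
    recall    <- tp / sum(cm[cls, ])        # out of everything actually cls
    f         <- (1 + beta^2) * precision * recall / (beta^2 * precision + recall)
    c(precision = precision, recall = recall, f = f)
  }))

  list(per_class = per_class, accuracy = sum(diag(cm)) / sum(cm))
}
```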
For any binary classification task, this returns the precision, recall, and F-stat for each class, plus the overall accuracy.
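For instance, called on made-up labels just to show the shape of the output:

```r
set.seed(2)
actual    <- factor(sample(c("yes", "no"), 40, replace = TRUE))
predicted <- factor(sample(c("yes", "no"), 40, replace = TRUE))

result <- prf_stats(predicted, actual)
result$per_class   # one row per class: precision, recall, f
result$accuracy    # overall accuracy
```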
Calculating the P, R, and F for each class like this lets you see whether one class or the other is giving you more difficulty, and it's then easy to calculate the overall P, R, and F stats. I haven't used the ROCR package, but you could easily derive the same ROC curves by training the classifier over a range of some parameter and calling the function on the resulting classifiers at points along that range.
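A rough sketch of that suggestion, reusing the hypothetical `prf_stats()` above (`fit_and_predict` is a placeholder for however you train and apply your classifier at a given parameter setting):

```r
# Evaluate the classifier at several settings of a tuning parameter and
# collect precision/recall for one class of interest at each setting.
sweep_pr <- function(param_grid, fit_and_predict, actual, class_of_interest) {
  t(sapply(param_grid, function(p) {
    predicted <- fit_and_predict(p)   # should return a factor of predicted labels
    stats <- prf_stats(predicted, actual)$per_class
    stats[class_of_interest, c("precision", "recall")]
  }))
}

# e.g. sweep_pr(1:25, function(k) knn(train_x, test_x, train_y, k = k), test_y, "pos")
```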