Solved – Calculate precision and recall

machine learning, precision-recall

I am really confused about how to calculate precision and recall in supervised machine learning when using a Naive Bayes (NB) classifier with more than two classes.

Say, for example:

  1. I have three classes $A$, $B$, and $C$.
  2. I have $10000$ documents, of which $2000$ go into the training sample set (class $A = 500$, class $B = 1000$, class $C = 500$).
  3. On the basis of this training sample set, I classify the remaining $8000$ documents using the NB classifier.
  4. After classification, $1000$ documents go to class $A$, $6000$ documents go to class $B$, and $1000$ documents go to class $C$.
  5. How do I now calculate precision and recall for each individual class?

I have figured out precision and recall for two classes; here is how it goes.

Suppose there are two classes, $A$ and $B$.

When a test is run on the documents labeled $A$, there are two possible classifications for each document: if the classification is $A$, add 1 to “true A” (TA); if the classification is $B$, add 1 to “false B” (FB). Similarly for the documents labeled $B$: if the classification is $A$, add 1 to “false A” (FA), and if the classification is $B$, add 1 to “true B” (TB).
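For what it's worth, here is a minimal sketch of that two-class tally in Python; the label lists are made-up placeholders, not the data from my example above.

```python
# Hypothetical gold labels and classifier output for six test documents.
true_labels      = ["A", "A", "B", "B", "B", "A"]
predicted_labels = ["A", "B", "B", "A", "B", "A"]

TA = FB = FA = TB = 0
for truth, pred in zip(true_labels, predicted_labels):
    if truth == "A":
        if pred == "A":
            TA += 1  # true A
        else:
            FB += 1  # false B
    else:  # truth == "B"
        if pred == "A":
            FA += 1  # false A
        else:
            TB += 1  # true B

print(TA, FB, FA, TB)  # -> 2 1 1 2
```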

I want to do the same thing when there are more than two classes.

Best Answer

The logic remains the same for several classes, to wit (see the sketch after this list):

  • If a document belonging to $A$…

    • is classified as $A$, it's a true positive for $A$/true A
    • is classified as $B$, it's a false positive for $B$/false B and a false negative for $A$
    • is classified as $C$, it's a false positive for $C$/false C and a false negative for $A$
  • If a document belonging to $B$…

    • is classified as $A$, it's a false positive for $A$/false A and a false negative for $B$
    • is classified as $B$, it's a true positive for $B$/true B
    • is classified as $C$, it's a false positive for $C$/false C and a false negative for $B$
  • etc.
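As a minimal sketch of this bookkeeping in Python (the label lists below are hypothetical stand-ins for the gold labels and the NB classifier's output):

```python
from collections import Counter

# Hypothetical gold labels and classifier predictions.
true_labels      = ["A", "A", "B", "C", "B", "C", "A"]
predicted_labels = ["A", "B", "B", "C", "C", "A", "A"]

tp, fp, fn = Counter(), Counter(), Counter()
for truth, pred in zip(true_labels, predicted_labels):
    if truth == pred:
        tp[truth] += 1   # true positive for the correctly predicted class
    else:
        fp[pred] += 1    # false positive for the predicted class
        fn[truth] += 1   # false negative for the true class

for c in sorted(set(true_labels)):
    precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
    recall    = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
    print(f"{c}: precision = {precision:.2f}, recall = {recall:.2f}")
```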

Precision for $A$ is true positives/(true positives + false positives), where the false positives are the documents from all the other classes that were classified as $A$ (i.e. the $B$ documents classified as $A$ + the $C$ documents classified as $A$, etc.).
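Written as a formula, with $FP_A$ collecting the $B$ and $C$ documents classified as $A$:

$$\text{Precision}_A = \frac{TP_A}{TP_A + FP_A}$$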

Recall for $A$ is true positives/(true positives + false negatives), where the false negatives are all the $A$ documents not classified as $A$ (i.e. the $A$ documents classified as $B$ + the $A$ documents classified as $C$, etc.) or, equivalently, the total number of $A$ documents minus the number of true positives.
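And likewise, with $FN_A$ collecting the $A$ documents classified as $B$ or $C$:

$$\text{Recall}_A = \frac{TP_A}{TP_A + FN_A} = \frac{TP_A}{\text{total number of } A \text{ documents}}$$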

You can also look at all this as a series of confusion matrices with two categories each: one with $A$ and non-$A$ (so $B$ and $C$ together), one with $B$ and non-$B$, and finally one with $C$ and non-$C$.
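A sketch of building those binary matrices by collapsing all the other classes into "non-$X$" (same hypothetical label lists as above):

```python
# Build the A/non-A, B/non-B and C/non-C confusion matrices.
true_labels      = ["A", "A", "B", "C", "B", "C", "A"]
predicted_labels = ["A", "B", "B", "C", "C", "A", "A"]

for c in ["A", "B", "C"]:
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == c and p == c for t, p in pairs)  # c classified as c
    fn = sum(t == c and p != c for t, p in pairs)  # c classified as non-c
    fp = sum(t != c and p == c for t, p in pairs)  # non-c classified as c
    tn = sum(t != c and p != c for t, p in pairs)  # non-c classified as non-c
    print(f"{c} vs. non-{c}: [[TP={tp}, FN={fn}], [FP={fp}, TN={tn}]]")
```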

Most informative is to report precision and recall for each category (especially if you have just a few), but I have seen people combine them into an $F_1$ score and average across categories to obtain some sort of overall performance measure.
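For reference, that combination is the per-class $F_1$ macro-averaged over the $K$ categories:

$$F_{1,k} = \frac{2 \cdot \text{Precision}_k \cdot \text{Recall}_k}{\text{Precision}_k + \text{Recall}_k}, \qquad \text{macro-}F_1 = \frac{1}{K} \sum_{k=1}^{K} F_{1,k}$$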