Solved – Precision and recall in a multi-class classification system

classification, precision-recall

Having looked at Wikipedia and looked around here, I'm still not clear how to identify the true positives, true negatives, false positives and false negatives in the multi-class classification problem that I'm doing, so that I can calculate precision, recall and F-measure.

I have a few categories and a few hundred documents. Having trained an SVM to perform classification, I have the following data:

document 1 | actual category, category assigned by SVM
document 2 | actual category, category assigned by SVM
...
document n | actual category, category assigned by SVM

The category assigned by the SVM is often the same as the actual category but not always. The true positives are when

actual category = category assigned by SVM

I guess false negatives would be

actual category =/= category assigned by SVM

But I'm not sure about false positives or true negatives.

I feel like this may be a basic question but I can't find a clear (enough) example.

Best Answer

For those who might be interested in an answer, this comes from a much more knowledgeable source than me (PhD candidate in NLP):

When doing multiclass classification, precision and recall are really only properly defined for individual classes (you can average across classes to get a general score for the entire system, but it's not really that useful; in my opinion, you're probably better off just using overall accuracy as your metric of performance).
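To make the per-class versus system-wide distinction concrete, here is a minimal Python sketch. The (actual, predicted) pairs and the category names are made up for illustration, and `precision_recall_f1` is a hypothetical helper, not part of any library. Returning 0.0 when a class is never predicted or never occurs is an assumption; other conventions exist.

```python
def precision_recall_f1(pairs, label):
    """Precision, recall and F1 for one class, given (actual, predicted) pairs."""
    tp = sum(1 for actual, predicted in pairs if actual == label and predicted == label)
    n_predicted = sum(1 for _, predicted in pairs if predicted == label)  # TP + FP
    n_actual = sum(1 for actual, _ in pairs if actual == label)           # TP + FN
    # Assumption: report 0.0 instead of dividing by zero.
    precision = tp / n_predicted if n_predicted else 0.0
    recall = tp / n_actual if n_actual else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up example data: (actual category, category assigned by the classifier).
pairs = [
    ("sports", "sports"),
    ("sports", "politics"),
    ("politics", "politics"),
    ("tech", "politics"),
    ("tech", "sports"),
    ("tech", "tech"),
]
labels = sorted({actual for actual, _ in pairs})

per_class = {label: precision_recall_f1(pairs, label) for label in labels}

# Macro-average: unweighted mean of the per-class scores.
macro_precision = sum(p for p, _, _ in per_class.values()) / len(labels)
macro_recall = sum(r for _, r, _ in per_class.values()) / len(labels)

# Overall accuracy: fraction of documents whose assigned category is correct.
accuracy = sum(1 for actual, predicted in pairs if actual == predicted) / len(pairs)

for label, (p, r, f1) in per_class.items():
    print(f"{label}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print(f"macro precision={macro_precision:.3f} macro recall={macro_recall:.3f}")
print(f"accuracy={accuracy:.3f}")
```

On this toy data the three classes have quite different precision and recall, which the single accuracy figure hides. That is the trade-off between per-class scores and one overall number.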

For an individual class, the false positives are those instances that were classified as that class but in fact belong to another, and the true negatives are those instances that do not belong to that class and were not assigned to it (regardless of whether they were assigned to their own correct class).
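The definitions above can be sketched in a few lines of Python by treating one class as "positive" and every other class as "negative". The data and category names are invented for illustration, and `per_class_counts` is a hypothetical helper name. True positives and false negatives are filled in by the usual one-vs-rest convention, which the answer does not spell out.

```python
from collections import Counter


def per_class_counts(pairs, label):
    """Count TP, FP, FN and TN for one class from (actual, predicted) pairs."""
    counts = Counter()
    for actual, predicted in pairs:
        if predicted == label and actual == label:
            counts["tp"] += 1  # assigned to this class, and correctly so
        elif predicted == label:
            counts["fp"] += 1  # assigned to this class, but belongs to another
        elif actual == label:
            counts["fn"] += 1  # belongs to this class, but assigned elsewhere
        else:
            counts["tn"] += 1  # neither belongs to nor was assigned to this class
    return counts


# Made-up example data: (actual category, category assigned by the classifier).
pairs = [
    ("sports", "sports"),
    ("sports", "politics"),
    ("politics", "politics"),
    ("tech", "politics"),
    ("tech", "sports"),
    ("tech", "tech"),
]

for label in ("sports", "politics", "tech"):
    c = per_class_counts(pairs, label)
    print(label, c["tp"], c["fp"], c["fn"], c["tn"])
```

Note that the ("tech", "politics") document counts as a true negative for "sports" even though it was misclassified. That is what "regardless of whether they were correctly classified" means here.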