Solved – Precision and recall in a multi-class classification system

classification, precision-recall

Having looked at Wikipedia and looked around here, I'm still not clear how to identify the true positives, true negatives, false positives and false negatives in the multi-class classification problem that I'm doing, so that I can calculate precision, recall and F-measure.

I have a few categories and a few hundred documents. Having trained an SVM to perform classification, I have the following data:

document 1 | actual category, category assigned by SVM
document 2 | actual category, category assigned by SVM
...
document n | actual category, category assigned by SVM

The category assigned by the SVM is often the same as the actual category but not always. The true positives are when

actual category = category assigned by SVM

I guess false negatives would be

actual category =/= category assigned by SVM

But I'm not sure about false positives or true negatives.

I feel like this may be a basic question but I can't find a clear (enough) example.

Best Answer

For those who might be interested in an answer, this comes from a much more knowledgeable source than me (PhD candidate in NLP):

When doing multiclass classification, precision and recall are really only properly defined for individual classes. You can average across classes to get a general score for the entire system, but it's not really that useful; in my opinion, you're probably better off just using overall accuracy as your metric of performance.

For an individual class, the false positives are those instances which were classified as that class, but in fact aren't, and the true negatives are those instances which are not that class, and were indeed classified as not belonging to that class (regardless of whether they were correctly classified).
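The per-class definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `per_class_counts` and the sample categories are made up for the example, and the input is assumed to be a list of (actual, predicted) pairs like the one in the question.

```python
from collections import Counter

def per_class_counts(pairs, label):
    """Count TP, FP, FN, TN for one class, treating `label` as the
    positive class and every other class as negative."""
    counts = Counter()
    for actual, predicted in pairs:
        if predicted == label:
            # Classified as this class: correct (TP) or not (FP).
            counts["TP" if actual == label else "FP"] += 1
        else:
            # Not classified as this class: missed (FN) or rightly excluded (TN).
            counts["FN" if actual == label else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}

# Hypothetical (actual, predicted) pairs for five documents.
pairs = [
    ("sports", "sports"),
    ("sports", "politics"),
    ("politics", "politics"),
    ("tech", "politics"),
    ("tech", "tech"),
]

for label in ("sports", "politics", "tech"):
    print(label, per_class_counts(pairs, label))
```

Note that the fourth document (actually "tech", classified as "politics") counts as a true negative for "sports" even though it was misclassified, which is the "regardless of whether they were correctly classified" point above.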

Related Solutions

Precision and Recall Values When Denominator Equals Zero

The answers to the linked earlier question apply here too.

If (true positives + false negatives) = 0 then there are no positive cases in the input data, so any analysis of this case carries no information and supports no conclusion about how positive cases are handled. You want N/A or something similar as the ratio result, avoiding a division by zero error.

If (true positives + false positives) = 0 then all cases have been predicted to be negative: this is one end of the ROC curve. Again, you want to recognise and report this possibility while avoiding a division by zero error.
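One way to handle both cases is to return a placeholder instead of dividing. The following is a minimal sketch, assuming `None` is an acceptable stand-in for "N/A"; the helper names `safe_ratio` and `precision_recall` are invented for this example.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the ratio is undefined."""
    return None if denominator == 0 else numerator / denominator

def precision_recall(tp, fp, fn):
    """Precision and recall from raw counts; None marks an undefined value."""
    precision = safe_ratio(tp, tp + fp)  # undefined if nothing was predicted positive
    recall = safe_ratio(tp, tp + fn)     # undefined if there are no positive cases
    return precision, recall

print(precision_recall(tp=0, fp=0, fn=5))  # (None, 0.0)
print(precision_recall(tp=0, fp=3, fn=0))  # (0.0, None)
print(precision_recall(tp=8, fp=2, fn=2))  # (0.8, 0.8)
```

Whether to report `None`, "N/A", or some conventional value such as 0 is a reporting choice; the point is only that the undefined case is detected explicitly.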

How to Calculate Precision and Recall for Multiclass Classification Using Confusion Matrix

In a 2-hypothesis case, the confusion matrix is usually:

	Declare H1	Declare H0
Is H1	TP	FN
Is H0	FP	TN

where I've used something similar to your notation:

TP = true positive (declare H1 when, in truth, H1),
FN = false negative (declare H0 when, in truth, H1),
FP = false positive (declare H1 when, in truth, H0),
TN = true negative (declare H0 when, in truth, H0).

From the raw data, the values in the table would typically be the counts for each occurrence over the test data. From this, you should be able to compute the quantities you need.
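As a rough sketch of that counting step, the table can be filled in from paired truth/decision values. The function name `binary_confusion` and the sample data are assumptions for illustration; `True` stands for H1 and `False` for H0.

```python
def binary_confusion(truth, declared):
    """Build the 2x2 table [[TP, FN], [FP, TN]] with rows = truth (H1, H0)
    and columns = declaration (H1, H0), matching the layout above."""
    tp = fn = fp = tn = 0
    for is_h1, says_h1 in zip(truth, declared):
        if is_h1 and says_h1:
            tp += 1      # declared H1, truly H1
        elif is_h1:
            fn += 1      # declared H0, truly H1
        elif says_h1:
            fp += 1      # declared H1, truly H0
        else:
            tn += 1      # declared H0, truly H0
    return [[tp, fn], [fp, tn]]

# Hypothetical test data: seven cases, three of which are truly H1.
truth    = [True, True, True, False, False, False, False]
declared = [True, True, False, True, False, False, False]

print(binary_confusion(truth, declared))  # [[2, 1], [1, 3]]
```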

Edit

The generalization to multi-class problems is to sum over rows / columns of the confusion matrix. Given that the matrix is oriented as above, i.e., that a given row of the matrix corresponds to specific value for the "truth", we have:

$\text{Precision}_{~i} = \cfrac{M_{ii}}{\sum_j M_{ji}}$

$\text{Recall}_{~i} = \cfrac{M_{ii}}{\sum_j M_{ij}}$

That is, precision is the fraction of events where we correctly declared $i$ out of all instances where the algorithm declared $i$. Conversely, recall is the fraction of events where we correctly declared $i$ out of all of the cases where the true state of the world is $i$.
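The two formulas can be checked on a small example. This is a sketch in plain Python, assuming the same orientation as above (rows are the truth, columns are the declared class); the function names and the three-class data are invented for illustration.

```python
def confusion_matrix(actual, predicted, labels):
    """M[i][j] = number of cases whose true class is labels[i]
    and whose declared class is labels[j]."""
    index = {label: k for k, label in enumerate(labels)}
    m = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        m[index[a]][index[p]] += 1
    return m

def precision(m, i):
    """M[i][i] divided by the sum of column i (all cases declared i)."""
    declared_i = sum(row[i] for row in m)
    return m[i][i] / declared_i if declared_i else None

def recall(m, i):
    """M[i][i] divided by the sum of row i (all cases that truly are i)."""
    truly_i = sum(m[i])
    return m[i][i] / truly_i if truly_i else None

# Hypothetical three-class data.
labels    = ["a", "b", "c"]
actual    = ["a", "a", "a", "b", "b", "c", "c", "c", "c"]
predicted = ["a", "a", "b", "b", "c", "c", "c", "c", "a"]

m = confusion_matrix(actual, predicted, labels)
print(m)  # [[2, 1, 0], [0, 1, 1], [1, 0, 3]]
for i, label in enumerate(labels):
    print(label, precision(m, i), recall(m, i))
```

For class "b", for instance, the column sum is 2 and the row sum is 2 with one correct case on the diagonal, so both precision and recall come out to 0.5.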
