How can I compute precision and recall from a confusion matrix for a multi-class classification problem? Specifically, each observation is assigned only to its most probable class / label. I would like to compute:
- Precision = TP / (TP+FP)
- Recall = TP / (TP+FN)
for each class, and then compute the micro-averaged F-measure.
Best Answer
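In a 2-hypothesis case, the confusion matrix is usually:

| | Declared positive | Declared negative |
|---|---|---|
| Truly positive | TP | FN |
| Truly negative | FP | TN |

where I've used something similar to your notation:
- TP = true positive (declared positive when truly positive)
- FN = false negative (declared negative when truly positive)
- FP = false positive (declared positive when truly negative)
- TN = true negative (declared negative when truly negative)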
From the raw data, the values in the table would typically be the counts for each occurrence over the test data. From this, you should be able to compute the quantities you need.
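As a rough illustration of that counting step, here is a minimal sketch in plain Python. It assumes rows index the true class and columns index the declared class; the helper name `confusion_matrix`, the labels `"pos"` / `"neg"`, and the toy data are all made up for the example.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Return M where M[i][j] counts items whose true class is labels[i]
    and whose declared (predicted) class is labels[j]."""
    index = {label: k for k, label in enumerate(labels)}
    m = [[0] * len(labels) for _ in labels]
    for truth, declared in zip(y_true, y_pred):
        m[index[truth]][index[declared]] += 1
    return m


# Hypothetical test data: six observations, two classes.
y_true = ["pos", "pos", "neg", "neg", "pos", "neg"]
y_pred = ["pos", "neg", "neg", "pos", "pos", "neg"]

M = confusion_matrix(y_true, y_pred, labels=["pos", "neg"])
tp, fn = M[0]  # row of truly positive cases
fp, tn = M[1]  # row of truly negative cases

precision = tp / (tp + fp)  # correct "pos" declarations / all "pos" declarations
recall = tp / (tp + fn)     # correct "pos" declarations / all truly "pos" cases
```

Listing the positive class first in `labels` puts TP in the top-left cell, which is only a convention chosen for this sketch.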
Edit
The generalization to multi-class problems is to sum over rows / columns of the confusion matrix. Let $M_{ij}$ be the number of cases whose true class is $i$ and which the algorithm declared to be class $j$. Given that the matrix is oriented as above, i.e., that a given row of the matrix corresponds to a specific value for the "truth", we have:
$\text{Precision}_{~i} = \cfrac{M_{ii}}{\sum_j M_{ji}}$
$\text{Recall}_{~i} = \cfrac{M_{ii}}{\sum_j M_{ij}}$
That is, precision is the fraction of events where we correctly declared $i$ out of all instances where the algorithm declared $i$. Conversely, recall is the fraction of events where we correctly declared $i$ out of all of the cases where the true state of the world is $i$.
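To tie this back to the micro-averaged F-measure in the question, here is a minimal sketch in plain Python. It assumes `M[i][j]` counts cases with true class `i` declared as class `j`; the function names and the 3-class matrix are invented for illustration, and returning `0.0` when a denominator is zero is a convention chosen here, not something the formulas prescribe.

```python
def per_class_precision_recall(M):
    """Per-class precision (diagonal / column sum) and recall (diagonal / row sum)."""
    n = len(M)
    precision, recall = [], []
    for i in range(n):
        declared_i = sum(M[j][i] for j in range(n))  # column sum: all cases declared i
        truly_i = sum(M[i])                          # row sum: all cases truly i
        precision.append(M[i][i] / declared_i if declared_i else 0.0)
        recall.append(M[i][i] / truly_i if truly_i else 0.0)
    return precision, recall


def micro_f1(M):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1 once."""
    n = len(M)
    tp = sum(M[i][i] for i in range(n))
    fp = sum(sum(M[j][i] for j in range(n)) - M[i][i] for i in range(n))
    fn = sum(sum(M[i]) - M[i][i] for i in range(n))
    if tp == 0:
        return 0.0
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return 2 * p * r / (p + r)


# Hypothetical 3-class confusion matrix (rows = truth, columns = declared).
M = [
    [5, 1, 0],
    [2, 3, 1],
    [0, 2, 6],
]

precision, recall = per_class_precision_recall(M)
f_micro = micro_f1(M)
accuracy = sum(M[i][i] for i in range(len(M))) / sum(sum(row) for row in M)
```

When every observation gets exactly one label, each off-diagonal count is a false positive for one class and a false negative for another, so the pooled FP and FN totals are equal and the micro-averaged precision, recall and F1 all coincide with overall accuracy (0.7 for this matrix).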