I was pondering about the same, and I solved it like this. Suppose you have a co-occurrence matrix/contingency table where the rows are the ground truth clusters, and the columns are the clusters found by the clustering algorithm.
So, for the example in the book, it would look like:
| 1 | 2 | 3
--+---+---+---
x | 5 | 1 | 2
--+---+---+---
o | 1 | 4 | 0
--+---+---+---
◊ | 0 | 1 | 3
Now, you can very easily compute the TP + FP by taking the sum per column and 'choose 2' over all those values. So the sums are [6, 6, 5] and you do '6 choose 2' + '6 choose 2' + '5 choose 2'.
Now, indeed, similarly, you can get TP + FN by taking the sum over the rows (so, that is [8, 5, 4] in the example above), apply 'choose 2' over all those values, and take the sum of that.
The TP's themselves can be calculated by applying 'choose 2' to every cell in the matrix and taking the sum of everything (assuming that '1 choose 2' is 0).
In fact, here is some Python code that does exactly that:
import numpy as np
from scipy.misc import comb
# There is a comb function for Python which does 'n choose k'
# only you can't apply it to an array right away
# So here we vectorize it...
def myComb(a,b):
return comb(a,b,exact=True)
vComb = np.vectorize(myComb)
def get_tp_fp_tn_fn(cooccurrence_matrix):
tp_plus_fp = vComb(cooccurrence_matrix.sum(0, dtype=int),2).sum()
tp_plus_fn = vComb(cooccurrence_matrix.sum(1, dtype=int),2).sum()
tp = vComb(cooccurrence_matrix.astype(int), 2).sum()
fp = tp_plus_fp - tp
fn = tp_plus_fn - tp
tn = comb(cooccurrence_matrix.sum(), 2) - tp - fp - fn
return [tp, fp, tn, fn]
if __name__ == "__main__":
# The co-occurrence matrix from example from
# An Introduction into Information Retrieval (Manning, Raghavan & Schutze, 2009)
# also available on:
# http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
#
cooccurrence_matrix = np.array([[ 5, 1, 2], [ 1, 4, 0], [ 0, 1, 3]])
# Get the stats
tp, fp, tn, fn = get_tp_fp_tn_fn(cooccurrence_matrix)
print "TP: %d, FP: %d, TN: %d, FN: %d" % (tp, fp, tn, fn)
# Print the measures:
print "Rand index: %f" % (float(tp + tn) / (tp + fp + fn + tn))
precision = float(tp) / (tp + fp)
recall = float(tp) / (tp + fn)
print "Precision : %f" % precision
print "Recall : %f" % recall
print "F1 : %f" % ((2.0 * precision * recall) / (precision + recall))
If I run it I get:
$ python testCode.py
TP: 20, FP: 20, TN: 72, FN: 24
Rand index: 0.676471
Precision : 0.500000
Recall : 0.454545
F1 : 0.476190
I actually didn't check any other examples than this one, so I hope I did it right.... ;-)
Best Answer
Basically:
Using wikipedia, we have the formula:
$ARI = \frac{ \sum_{ij} \binom{n_{ij}}{2} - [\sum_i \binom{a_i}{2} \sum_j \binom{b_j}{2}] / \binom{n}{2} }{ \frac{1}{2} [\sum_i \binom{a_i}{2} + \sum_j \binom{b_j}{2}] - [\sum_i \binom{a_i}{2} \sum_j \binom{b_j}{2}] / \binom{n}{2} }$
Let's assume we have the table:
Breaking into components:
So, then
$ARI = \frac{7 - 13*14/45}{(13 + 14)/2 - 13*14/45} = 0.313$
Confirming this result in R: