Solved – Accuracy vs Jaccard for multiclass problem

accuracy, jaccard-similarity, multi-class, scikit-learn

TL;DR For a multiclass problem, is Jaccard score the same as accuracy?


Update March 29, 2019

The wrong implementation in scikit-learn is now fixed with pull request #13151. Hooray!


P.S. The lesson here is that no matter how mature and widespread a library, framework, or idea is, there will always be bugs and shortcomings. It is up to you as an engineer, scientist, or student to verify the theory and the practical results of your work, especially if you rely on someone else's results.


I am working on a classification problem and calculating accuracy and the Jaccard score with scikit-learn, which, I think, is a widely used library in the Python scientific world. However, my MATLAB colleagues and I obtain different results.

sklearn.metrics.jaccard_similarity_score declares the following:

Notes: In binary and multiclass classification, this function is
equivalent to the accuracy_score. It differs in the multilabel
classification problem.

sklearn.metrics.accuracy_score says:

Notes In binary and multiclass classification, this function is equal
to the jaccard_similarity_score function.

Indeed, the jaccard_similarity_score implementation falls back to accuracy when the problem is not of multilabel type:

if y_type.startswith('multilabel'):
    ...
else:
    score = y_true == y_pred

return _weighted_sum(score, sample_weight, normalize)
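To see the discrepancy concretely, here is a small sketch (the labels and helper names are made up for illustration) that compares plain accuracy, which is what the fallback above computes, with a per-class intersection-over-union average:

```python
def accuracy(y_true, y_pred):
    # Fraction of exact label matches -- what the multiclass fallback
    # in jaccard_similarity_score effectively returns.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_jaccard(y_true, y_pred):
    # Per-class intersection over union, averaged over all classes.
    labels = set(y_true) | set(y_pred)
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        union = sum(t == c or p == c for t, p in zip(y_true, y_pred))
        scores.append(tp / union if union else 0.0)
    return sum(scores) / len(labels)

y_true = ['A', 'B', 'C']
y_pred = ['A', 'A', 'A']  # every sample predicted as class A

print(accuracy(y_true, y_pred))      # 1/3
print(macro_jaccard(y_true, y_pred)) # 1/9
```

The two numbers differ, so for multiclass input the fallback cannot be a genuine intersection-over-union metric.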

Doesn't this contradict the definition of the Jaccard index (intersection over union)? Are the "score" and the "index" different metrics? What is the correct and commonly accepted way to calculate Jaccard metrics for a multiclass problem?

Best Answer

The issue has been reported on scikit-learn GitHub repository: multiclass jaccard_similarity_score should not be equal to accuracy_score #7332

scikit-learn's Jaccard score for the multiclass classification task is incorrect.


A neat overview of the most commonly used performance metrics from {1}:

[Figure: 3×3 confusion matrix for classes A, B, C; entry XY counts samples of true class X predicted as class Y]

The accuracy is $\frac{\text{AA} + \text{BB} + \text{CC}}{\text{AA} + \text{AB} + \text{AC} + \text{BA} + \text{BB} + \text{BC} + \text{CA} + \text{CB} + \text{CC}}$.

The average Jaccard score a.k.a. average Jaccard coefficient is:

$\frac{1}{3}\left(\frac{\text{AA}}{\text{AA} + \text{AB} + \text{AC} + \text{BA} + \text{CA}} + \frac{\text{BB}}{\text{AB} + \text{BA} + \text{BB} + \text{BC} + \text{CB}} + \frac{\text{CC}}{\text{AC} + \text{BC} + \text{CA} + \text{CB} + \text{CC}}\right)$
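The averaged formula above can be sketched in code directly from a confusion matrix (the function name and the convention that rows are true classes and columns are predicted classes are assumptions for illustration):

```python
def macro_jaccard_from_confusion(M):
    # M[i][j] = number of samples with true class i predicted as class j.
    n = len(M)
    scores = []
    for k in range(n):
        tp = M[k][k]                                   # diagonal: correct predictions
        fn = sum(M[k][j] for j in range(n) if j != k)  # rest of row k: missed class k
        fp = sum(M[i][k] for i in range(n) if i != k)  # rest of column k: wrongly predicted k
        union = tp + fn + fp
        scores.append(tp / union if union else 0.0)
    return sum(scores) / n

# Example: 2-class matrix with one error per class direction.
print(macro_jaccard_from_confusion([[2, 1], [0, 3]]))  # (2/3 + 3/4) / 2
```

Each per-class term is exactly the class's diagonal cell divided by the union of its row and column, matching the fractions in the formula.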


For example, if the confusion matrix is:

[Figure: confusion matrix, rows = true class, columns = predicted class]

            Predicted A   Predicted B   Predicted C
    True A       1             0             0
    True B       1             0             0
    True C       1             0             0

Then:

  • the accuracy is $\frac{1 + 0 + 0}{1 + 0 + 0 + 1 + 0 + 0 + 1 + 0 + 0} = \frac{1}{3}$
  • the average Jaccard score is $\frac{1}{3}\left(\frac{1}{1 + 0 + 0 + 1 + 1} + \frac{0}{0 + 1 + 0 + 0 + 0} + \frac{0}{0 + 0 + 1 + 0 + 0}\right) = \frac{1}{9}$
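The arithmetic above can be checked with exact rational numbers (a throwaway check; the entries are simply read off the confusion matrix, class by class):

```python
from fractions import Fraction

# Every sample is predicted as class A, so only class A has a true positive.
per_class = [
    Fraction(1, 1 + 0 + 0 + 1 + 1),  # class A: AA / (AA+AB+AC+BA+CA)
    Fraction(0, 0 + 1 + 0 + 0 + 0),  # class B: BB / (AB+BA+BB+BC+CB)
    Fraction(0, 0 + 0 + 1 + 0 + 0),  # class C: CC / (AC+BC+CA+CB+CC)
]
accuracy = Fraction(1 + 0 + 0, 3)
avg_jaccard = sum(per_class) / 3

print(accuracy)     # 1/3
print(avg_jaccard)  # 1/9
```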

References: