Algorithmic Fairness – Why Classifiers Cannot Be Well-Calibrated and Achieve Error Rate Balance Across Groups

algorithmic-fairness · calibration · machine-learning

There are by now several results in the literature stating that a classifier cannot satisfy calibration and error rate balance at the same time if there are actual differences between the groups. To pick one exemplary result, Kleinberg et al. (2016) derive that the following three conditions can only be fulfilled simultaneously if there are no actual differences between the groups (copied verbatim from their paper; I restate them in symbols after the list):

  1. Calibration within groups, i.e., for each group $t$, and each bin $b$ with associated score $v_b$, the expected number of people from group $t$ in $b$ who belong to the positive class should be a $v_b$ fraction of the expected number of people from group $t$ assigned to $b$.
  2. Balance for the negative class, i.e., the average score assigned to people of group 1 who belong to the negative class should be the same as the average score assigned to people of group 2 who belong to the negative class.
  3. Balance for the positive class, i.e., the average score assigned to people of group 1 who belong to the positive class should be the same as the average score assigned to people of group 2 who belong to the positive class.
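
In symbols (my own notation, not the paper's), writing $S$ for the score, $Y \in \{0, 1\}$ for the class, and $G$ for the group, I read these conditions as:

$$\begin{aligned} \text{1. Calibration within groups:} \quad & \mathbb{E}[Y \mid S = v_b, G = t] = v_b \quad \text{for all bins } b, \\ \text{2. Balance for the negative class:} \quad & \mathbb{E}[S \mid Y = 0, G = 1] = \mathbb{E}[S \mid Y = 0, G = 2], \\ \text{3. Balance for the positive class:} \quad & \mathbb{E}[S \mid Y = 1, G = 1] = \mathbb{E}[S \mid Y = 1, G = 2]. \end{aligned}$$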

I can follow their derivation (and other, similar ones), but I am still missing an intuition: why can I not have a well-calibrated classifier that achieves error rate balance in any non-trivial case? Why are these two requirements contradictory? It seems to have something to do with the fact that if the base rates differ between groups, one cannot have the true positive rate, false positive rate, positive predictive value, and negative predictive value all be equal across groups. (See the Fair ML book, pp. 56-57.) But I still can't wrap my head around why, intuitively, that is not possible. Maybe someone has a nice illustrative example or can otherwise provide intuition?

Best Answer

The essential intuition for why calibration by group and separation (= balance for the positive/negative classes) are incompatible is that the average score of a calibrated classifier within each group equals the base rate of that group: by the tower property, $\mathbb{E}[S \mid \text{group} = i] = \mathbb{E}\big[\mathbb{E}[Y \mid S, \text{group} = i] \mid \text{group} = i\big] = P(Y = 1 \mid \text{group} = i)$. From this it is already almost apparent that equal average risk scores in the positive/negative classes of each group cannot be achieved if there are base rate differences (and the classifier is not perfect).
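
To make this concrete, here is a minimal simulation sketch (the Beta score distributions and group names are made up for illustration). Outcomes are drawn as $Y \sim \text{Bernoulli}(S)$, so the scores are calibrated within each group by construction; the mean score then matches each group's base rate, and the class-conditional average scores differ across groups, violating conditions 2 and 3:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical calibrated scores: within each group, outcomes are drawn as
# Y ~ Bernoulli(S), so E[Y | S = s] = s holds by construction.
scores = {
    "group 1": rng.beta(2, 6, size=100_000),  # base rate 2/(2+6) = 0.25
    "group 2": rng.beta(4, 4, size=100_000),  # base rate 4/(4+4) = 0.50
}

for name, s in scores.items():
    y = rng.binomial(1, s)  # outcomes consistent with the scores
    print(name)
    print("  mean score        :", round(s.mean(), 3))          # = base rate (condition 1)
    print("  base rate         :", round(y.mean(), 3))
    print("  avg score | Y = 1 :", round(s[y == 1].mean(), 3))  # differs across groups
    print("  avg score | Y = 0 :", round(s[y == 0].mean(), 3))  # differs across groups
```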

More formally, from the above insight one can derive that, in the diagram below (average score in the negative class on one axis, average score in the positive class on the other), the calibrated classifiers for each group lie on a straight line, and the conditions can only hold jointly where the lines for the different groups intersect. That can only happen if the classifier is perfect or if there are no base rate differences between the groups.
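
To spell out where the straight lines come from (a short derivation in the notation above): write $p_i$ for the base rate of group $i$, and $\bar{s}_i^{+}$, $\bar{s}_i^{-}$ for the average scores in its positive and negative class. Calibration forces the overall average score in group $i$ to equal $p_i$, and decomposing that average by class gives a linear constraint:

$$p_i\, \bar{s}_i^{+} + (1 - p_i)\, \bar{s}_i^{-} = \mathbb{E}[S \mid G = i] = p_i.$$

In the $(\bar{s}^{-}, \bar{s}^{+})$ plane this is one line per group. Subtracting the constraints of two groups with $p_1 \neq p_2$ yields $\bar{s}^{+} - \bar{s}^{-} = 1$, which within $[0, 1]^2$ leaves only the perfect classifier $(\bar{s}^{-}, \bar{s}^{+}) = (0, 1)$; if $p_1 = p_2$, the two lines coincide.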

[Figure: average score in the negative class vs. average score in the positive class; each group's calibrated classifiers lie on a straight line through the perfect-classifier point $(0, 1)$, with the line determined by the group's base rate]

Crucially, error rate balance (= equal TPR and FPR across groups) is not the same as separation, and it is in principle possible to achieve error rate balance and calibration by group at the same time. (I provide an example of this on my blog, see the link below.) To achieve exact error rate balance, however, the ROC curves of the different groups would have to intersect, which is unlikely to be the case in any practical application.
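
As a quick numerical illustration (a sketch reusing the hypothetical groups from the simulation above; the cutoff 0.4 is arbitrary): thresholding both groups' calibrated scores at a common cutoff generally yields different TPRs and FPRs, so exact error rate balance would require per-group operating points at which the ROC curves cross:

```python
import numpy as np

rng = np.random.default_rng(0)
threshold = 0.4  # arbitrary common decision cutoff

# Same hypothetical calibrated groups as in the simulation above.
for name, (a, b) in {"group 1": (2, 6), "group 2": (4, 4)}.items():
    s = rng.beta(a, b, size=100_000)
    y = rng.binomial(1, s)     # outcomes calibrated to the scores
    yhat = s >= threshold      # predicted class at the common cutoff
    tpr = yhat[y == 1].mean()  # true positive rate
    fpr = yhat[y == 0].mean()  # false positive rate
    print(f"{name}: TPR = {tpr:.3f}, FPR = {fpr:.3f}")
```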

I just summarized all of this in more detail in a post on my personal blog.