Information-Theoretic Bias Assessment Of Learned Representations Of Pretrained Face Recognition
Introduction
As fairness issues in the use of face recognition have garnered increasing attention, greater efforts have been made to debias deep learning models and improve fairness toward minorities. However, there is still no clear definition of, nor sufficient analysis for, bias assessment metrics. Therefore, we propose an information-theoretic, independent bias assessment metric to identify the degree of bias against protected demographic attributes from the learned representations of pretrained facial recognition systems.
Most prior work focuses on bias mitigation rather than bias assessment, and in particular neglects bias at the representation/embedding level. Furthermore, most proposed bias assessment metrics are rarely used to evaluate other debiased models, which would demonstrate their universality. To the best of our knowledge, there is no universal way to assess the degree of bias of existing debiased models, as these models have been evaluated on different benchmarks under varying conditions.
Broadly, three main approaches to bias assessment rely on (1) classification accuracy across cohorts, (2) information leakage from protected attributes to the prediction logits of labels, or (3) correlations between prediction logits and protected attributes estimated with shallow classifiers. However, accuracy-based bias assessment may be imprecise because accuracies across cohorts must be listed and compared jointly. Moreover, information-leakage-based methods, such as Demographic Parity, Equality of Odds, and Equality of Opportunity, strictly define fairness of a classification model as independence between protected attributes and the prediction logits of labels, which may not be appropriate for comparing debiased models directly. In addition, quantitative metrics relying on correlations estimated with a shallow predictor may find correlations even in unbiased data and would then mistakenly identify the data as biased. Given these drawbacks, adopting a precise, universally applicable and accepted metric for the degree of bias is both intrinsically difficult and important.
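For reference, these information-leakage criteria are commonly formalized as follows, with protected attribute $A$, prediction $\hat{Y}$, and ground-truth label $Y$ (standard definitions, not specific to our metric):

```latex
% Demographic Parity: predictions are independent of the protected attribute.
P(\hat{Y}=1 \mid A=a) \;=\; P(\hat{Y}=1 \mid A=a') \qquad \forall\, a, a'
% Equality of Odds: independence conditioned on the ground-truth label.
P(\hat{Y}=1 \mid A=a, Y=y) \;=\; P(\hat{Y}=1 \mid A=a', Y=y) \qquad \forall\, a, a', y
% Equality of Opportunity: the condition is required only for the positive class.
P(\hat{Y}=1 \mid A=a, Y=1) \;=\; P(\hat{Y}=1 \mid A=a', Y=1) \qquad \forall\, a, a'
```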
A brief summary of the different metrics is shown in TABLE I. Specifically, predominant bias assessments rely on cross-cohort terms based on accuracy, e.g., classification accuracy (Alvi et al. (2018)). As a representative information-leakage-based metric, Zhao et al. (2017) defined bias amplification as the difference in bias score (i.e., the percentage of co-occurrences of a given outcome and a demographic variable in the corpus) between training data and test data. Among estimated-correlation-based metrics, Li et al. (2019) proposed dataset bias to capture the bias of a dataset, measured by classification performance under the cross-entropy loss, and Adeli et al. (2021) used distance correlation to assess bias at the representation level.
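As a concrete illustration of the estimated-correlation family, the following is a small sketch (not Adeli et al.'s exact implementation) of the empirical distance correlation between a batch of embeddings and a protected attribute; values near zero suggest the representations carry little information about the attribute, while larger values indicate dependence. The example data at the end are synthetic and purely illustrative.

```python
import numpy as np


def _pairwise_dist(v):
    """Euclidean distance matrix for samples stacked along the first axis."""
    v = np.asarray(v, dtype=float).reshape(len(v), -1)
    return np.linalg.norm(v[:, None, :] - v[None, :, :], axis=-1)


def distance_correlation(x, y):
    """Empirical distance correlation (Szekely et al., 2007) between x and y."""
    a, b = _pairwise_dist(x), _pairwise_dist(y)
    # Double-center each distance matrix.
    A = a - a.mean(axis=0) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(max(dcov2, 0.0) / (denom + 1e-12))


# Example: 512-d face embeddings vs. a binary protected attribute (synthetic data).
rng = np.random.default_rng(0)
emb, attr = rng.normal(size=(256, 512)), rng.integers(0, 2, size=256)
print(distance_correlation(emb, attr))
```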

TABLE I: Taxonomy of different bias metrics.

Fig.1: Illustration of Representation-Level Bias (RLB).
As illustrated in Fig.1, we use entropy to assess dataset bias and mutual information to assess model bias from the learned representations extracted by backbone models, rather than simply establishing correlations by training a shallow predictor on logits. Entropy is estimated empirically via the Monte Carlo method, while mutual information is estimated with the pipeline shown in Fig.2. We combine dataset bias and representation-level model bias to assess the percentage of bias remaining after a debiased backbone model, given the overall dataset bias; a large remaining bias indicates inferior debiasing performance. In this respect, our method can also assess bias from representations/embeddings in a layer-by-layer fashion inside any model.
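Since the estimators are only described at a high level here, the following is a minimal sketch of one possible realization, not the exact pipeline of Fig.2: dataset bias as the empirical (Monte Carlo) entropy of the protected attribute, model bias as a MINE-style (Donsker-Varadhan) lower bound on the mutual information between embeddings and the attribute, and a remaining-bias ratio combining the two. The statistics-network architecture, optimizer settings, and the final ratio are illustrative assumptions.

```python
import numpy as np
import torch
import torch.nn as nn


def dataset_bias(attr):
    """Dataset bias as the empirical (Monte Carlo) entropy H(A), in nats."""
    _, counts = np.unique(attr, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())


class StatisticsNetwork(nn.Module):
    """T(z, a): scores joint samples against product-of-marginals samples."""
    def __init__(self, z_dim, a_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + a_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, z, a):
        return self.net(torch.cat([z, a], dim=1))


def estimate_model_bias(z, a, epochs=200, lr=1e-3):
    """Donsker-Varadhan lower bound on I(Z; A), trained on the full batch."""
    z = torch.as_tensor(z, dtype=torch.float32)
    a = torch.as_tensor(a, dtype=torch.float32)
    if a.ndim == 1:
        a = a[:, None]
    T = StatisticsNetwork(z.shape[1], a.shape[1])
    opt = torch.optim.Adam(T.parameters(), lr=lr)
    for _ in range(epochs):
        joint = T(z, a).mean()
        # Shuffling A breaks the pairing and approximates p(z)p(a).
        marginal = torch.logsumexp(T(z, a[torch.randperm(len(a))]), dim=0) - np.log(len(a))
        loss = -(joint - marginal)  # maximize the DV bound
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (joint - marginal).item()


# Hypothetical remaining-bias ratio: model bias relative to dataset bias.
# remaining_bias = estimate_model_bias(embeddings, attributes) / dataset_bias(attributes)
```

In this sketch, embeddings would be the pooled features of a pretrained backbone on a held-out set and attributes the corresponding protected labels; repeating the two calls on intermediate features gives the layer-by-layer assessment mentioned above.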

Fig.2: Pipeline of mutual information estimation.
Experimental Results
Based on Fig.3 and Fig.4, RLB declines as (1) the standard deviation of the assigned color range increases in Colored MNIST, (2) the percentage of female or Black subjects moves away from the balance point in the FairFace dataset, and (3) the entropy of gender or race increases in the synthetic dataset. Therefore, RLB is consistent with the degree of imbalance.

Fig.3: Verification experiments on Colored MNIST Dataset.

Fig.4: Representation-level bias for sex and race.
TABLE II and TABLE III present a two-dimensional comparison: debiasing models are compared across rows and metrics across columns. Furthermore, Fig.5 shows the advantages of RLB (clear discrepancy and small variation), demonstrating better precision.

Fig.5: Debiasing performance comparison of debiasing models on CelebA Dataset using box plots.
TABLE II: Debiasing performance comparison of debiasing models on CelebA Dataset.

TABLE III: Debiasing performance comparison of debiasing models on FairFace Dataset.
