Invariant Representations through Adversarial Forgetting

November 20, 2019


Supervised machine learning models are meant to correlate a target with the underlying aspects of the data that are truly informative of it, yet they frequently learn to associate irrelevant factors as well. Overfitting occurs when such erroneous links between the target and nuisance factors are learned.

Biasing factors, on the other hand, may cause models to learn connections between the target and factors that are correlated with it only in the collected training data. Because of such biasing factors, trained models can be unfair to groups that are under-represented in the training data, raising ethical and legal concerns. It is therefore critical to build models that are invariant to both nuisance and biasing factors.

A popular strategy for achieving invariance to undesired factors has been data augmentation: training on real data together with copies minimally perturbed with respect to those factors. While such methods have been used to train deep neural networks (DNNs), newer approaches instead eliminate undesired information from the latent representation, penalizing the model for retaining it through an information bottleneck.
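The augmentation strategy can be sketched as follows. This is an illustrative toy example, not code from the paper: the nuisance factor (here, an additive brightness shift on a 1-D "image") and the function names are assumptions for the sake of the sketch.

```python
import numpy as np

def augment_brightness(x, n_copies=4, scale=0.1, rng=None):
    """Return the original sample stacked with copies perturbed by
    small random brightness shifts (the hypothetical nuisance factor)."""
    rng = np.random.default_rng(rng)
    shifts = rng.uniform(-scale, scale, size=(n_copies, 1))
    # Broadcasting adds one shift per copy; row 0 is the original.
    return np.vstack([x[None, :], x[None, :] + shifts])

x = np.linspace(0.0, 1.0, 8)           # a toy 8-pixel "image"
batch = augment_brightness(x, n_copies=4, scale=0.1, rng=0)
print(batch.shape)                     # (5, 8): original + 4 perturbed copies
```

Training on `batch` with the same label for every row nudges the model toward ignoring the perturbed factor, which is the "invariance by inclusion" idea the next paragraph contrasts with.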

These approaches outperform data augmentation because they achieve invariance by exclusion rather than inclusion. However, the bottleneck objective is difficult to optimize, so it has previously been approximated with variational inference or implemented as Information Dropout.
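To make the variational approximation concrete, here is a minimal sketch (not the paper's exact objective): with a Gaussian encoder posterior q(z|x) = N(mu, sigma^2), the intractable mutual-information term is upper-bounded by a KL divergence to a standard-normal prior, which is then added to the task loss with a trade-off weight beta. All names here are illustrative.

```python
import numpy as np

def kl_gaussian_standard(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over dimensions.
    Closed form: 0.5 * sum(sigma^2 + mu^2 - 1 - log sigma^2)."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

mu = np.array([0.5, -0.3])             # toy posterior means
log_var = np.array([0.0, 0.0])         # unit variance in each dimension
penalty = kl_gaussian_standard(mu, log_var)
# total_loss = task_loss + beta * penalty, for some trade-off beta
print(round(penalty, 3))               # 0.5 * (0.25 + 0.09) = 0.17
```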


The figure above depicts the main concept underlying our method. We propose a new adversarial forgetting technique in which a forget-mask filters out information about undesired factors while retaining information about the target, yielding a latent representation that is maximally informative of the target yet invariant to undesired nuisance and biasing factors.

The proposed framework draws on two sources of inspiration: the discovery that DNN classifiers learn richer features when augmented with reconstruction objectives, and the forgetting operation in Long Short-Term Memory (LSTM) cells. It adopts the idea of "discovery and separation of information" for invariance, which is fundamentally different from the direct removal of undesired factors in DANN (Domain-Adversarial Neural Network) and CAI (Controllable Adversarial Invariance). The forget-gate serves as an information bottleneck, and adversarial training drives it to produce forget-masks that eliminate the undesirable factors.
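The forget-gate idea can be sketched in a few lines of NumPy. This is a toy forward pass, not the authors' implementation: the one-layer mask generator, the dimensionality, and the random weights are placeholders. In the full framework, a target predictor trained on the masked representation and an adversarial discriminator (which tries to recover the undesired factor s from it) jointly shape the mask.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
d = 6                                  # toy latent dimensionality
W_mask = rng.normal(size=(d, d))       # placeholder one-layer mask generator

z = rng.normal(size=d)                 # encoder output for one sample
m = sigmoid(W_mask @ z)                # forget-mask with entries in (0, 1)
z_tilde = z * m                        # masked (invariant) representation

print(m.min() > 0.0 and m.max() < 1.0)  # True: valid gate values
```

Multiplying by a (0, 1)-valued mask acts as the information bottleneck: dimensions with near-zero gate values are effectively "forgotten", and adversarial pressure pushes the gates to zero out exactly the dimensions carrying the undesired factors.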

Experimental Results

As shown in the tables above, the reported results indicate that the proposed model achieves state-of-the-art performance with respect to both nuisance and biasing factors across a wide range of datasets and tasks, compared with Neural Network (NN) + Maximum Mean Discrepancy (MMD), the Variational Fair Autoencoder (VFAE), Controllable Adversarial Invariance (CAI), the Conditional Variational Information Bottleneck (CVIB), and the Unsupervised Adversarial Invariance (UAI) framework. Beyond the quantitative results, the t-SNE plots of z and z~ in the figure below offer a more direct visualization: the invariant z~ displays no grouping by s, whereas z reveals obvious s-subgroups within each y-cluster.