Unsupervised Adversarial Invariance

September 26, 2018


Representation learning is a key ingredient of machine learning algorithms, and the effectiveness and efficiency of a representation correlate directly with the performance of an algorithm. Supervised algorithms learn a mapping from a data sample x to a target variable y by estimating the conditional probability p(y|x) from data. In the real world, the data sample x often contains nuisance factors and noise that are irrelevant to the prediction of y and can lead to overfitting. For example, a nuisance factor in face recognition from images is the lighting condition under which the photograph was captured; a recognition model that associates lighting with subject identity is expected to perform poorly. Developing supervised learning models that are invariant to noise and nuisance factors has been a long-standing problem in the machine learning community.

Previous approaches to this problem include feature selection, augmentation, and invariance induction. Popular feature selection methods use information-theoretic measures or supervised methods to score features by their importance for the prediction task and then prune the low-scoring ones. The method we propose can be interpreted as an implicit feature selection mechanism for neural networks that works both on raw data, such as images, and on feature sets, such as frequency features computed from textual data.
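To make the explicit "score and prune" style of feature selection concrete, here is a minimal sketch that scores discrete features by their empirical mutual information with the label and keeps the top-scoring ones. It illustrates the prior approach described above, not the proposed method; the function names (`mutual_information`, `select_features`) and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi


def select_features(rows, labels, keep):
    """Score each column by its MI with the labels; return the kept column indices and all scores."""
    n_features = len(rows[0])
    scores = [
        mutual_information([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep]), scores


# Toy data (hypothetical): column 0 equals the label, column 1 is an
# independent nuisance factor, and column 2 is constant.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1)]
labels = [0, 0, 1, 1]
kept, scores = select_features(rows, labels, keep=1)
```

In this toy case only column 0 carries information about the label, so it is the one retained; the nuisance and constant columns score zero and are pruned.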

We propose a generalized framework for unsupervised induction of invariance to nuisance factors. The framework disentangles the information required for predicting y from the unrelated information contained in x by adding two components to the training objective: data reconstruction, as a task that competes with the primary prediction task, and a disentanglement term. The information in the data sample x is split into two representations, e1 and e2, which together contain all the information needed to reconstruct x. All the information necessary for the prediction task is pulled into e1, while the remaining nuisance factors and noise are pulled into e2. The prediction and reconstruction tasks in our framework are designed to compete, and this competition drives the splitting of information between e1 and e2.
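The structure of such an objective can be sketched for a single sample as follows. This is a rough illustration under stated assumptions, not the paper's implementation: all maps are bias-free linear layers, the disentanglement term is taken to be the squared error of two hypothetical cross-predictors (e1 to e2 and e2 to e1), and the weights `alpha`, `beta`, `gamma` and all parameter names are invented for the example.

```python
import math


def linear(weights, vector):
    """Apply a bias-free linear map given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def squared_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def objective_terms(x, y, params, k):
    """Return the three loss terms for one sample (x, y); k is the size of e1."""
    h = linear(params["encoder"], x)
    e1, e2 = h[:k], h[k:]  # e1 feeds the predictor; e2 holds everything else
    probs = softmax(linear(params["predictor"], e1))
    x_hat = linear(params["decoder"], e1 + e2)  # reconstruction uses both parts
    # Assumed disentanglement measure: how badly each part predicts the other.
    e2_from_e1 = linear(params["e1_to_e2"], e1)
    e1_from_e2 = linear(params["e2_to_e1"], e2)
    return {
        "prediction": -math.log(probs[y]),
        "reconstruction": squared_error(x, x_hat),
        "disentanglement": squared_error(e2, e2_from_e1)
        + squared_error(e1, e1_from_e2),
    }


def total_loss(terms, alpha, beta, gamma):
    """Combine the terms; the negative sign rewards failure of the cross-predictors."""
    return (
        alpha * terms["prediction"]
        + beta * terms["reconstruction"]
        - gamma * terms["disentanglement"]
    )


# Toy parameters (hypothetical): identity encoder and decoder, zero cross-predictors.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
params = {
    "encoder": identity,
    "predictor": [[1.0], [-1.0]],       # two classes, one-dimensional e1
    "decoder": identity,
    "e1_to_e2": [[0.0], [0.0]],         # maps e1 (1-d) to e2 (2-d)
    "e2_to_e1": [[0.0, 0.0]],           # maps e2 (2-d) to e1 (1-d)
}
terms = objective_terms([1.0, 2.0, 3.0], 0, params, k=1)
loss = total_loss(terms, alpha=1.0, beta=1.0, gamma=0.1)
```

In an adversarial setup of this kind, the cross-predictors would be trained to minimize the disentanglement error while the encoder is trained to increase it, which is why the term enters `total_loss` with a negative sign in this sketch.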

The framework introduced in this paper is quite general and can be applied to any supervised learning task. The figure above shows t-SNE visualizations of the representations learned with our method (a) and with a standard information bottleneck (b). The features are more tightly clustered with the adversarial invariance framework. The proposed method is not designed to learn “fair representations” of data, for example making predictions about a person's savings invariant to age, for settings where such bias exists in the data and making the prediction invariant to the biasing factor takes priority over prediction performance.