MUSCLE: Strengthening Semi-Supervised Learning Via Concurrent Unsupervised Learning Using Mutual Information Maximization

November 30, 2020

Introduction

In recent years, Deep Neural Networks (DNNs) have achieved remarkable results on various computer vision tasks, such as image classification, object detection, and semantic segmentation. To reach satisfactory performance, conventional approaches typically require a large amount of labeled data. However, obtaining a label for each sample is far more expensive than merely collecting the data, which heavily limits the applicability and scalability of DNNs. This paper proposes a training scheme for the semi-supervised learning (SemiSL) problem, which mitigates this limitation by utilizing both labeled and unlabeled data.

In the SemiSL setting, data samples are abundant, but only a small portion of them are labeled. The performance of traditional supervised objectives (e.g., cross-entropy loss) therefore degrades severely, since they cannot learn from the unlabeled data. To better utilize all of the data, conventional SemiSL methods can be broadly categorized into two groups: consistency loss and pseudo labeling. Consistency loss [1, 2] minimizes the disagreement between the predictions or representations of different variations of the same sample, without requiring its label. Pseudo labeling [3, 4] explicitly assigns labels to unlabeled data, so that supervised learning objectives can be applied directly to the pseudo labels. Despite their success, both groups still require a sizable amount of unbiased labeled data; otherwise, the model either falls into trivial solutions or fails to generate accurate pseudo labels.
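To make the two groups concrete, the following is a minimal sketch of each objective, assuming PyTorch; the names model, weak_aug, strong_aug, and the confidence threshold are hypothetical placeholders and not taken from the cited methods.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, weak_aug, strong_aug):
    """Penalize prediction disagreement between two views of the same samples."""
    p_weak = F.softmax(model(weak_aug(x_unlabeled)), dim=1)
    p_strong = F.softmax(model(strong_aug(x_unlabeled)), dim=1)
    # The weak-view prediction acts as a (detached) target for the strong view.
    return F.mse_loss(p_strong, p_weak.detach())

def pseudo_label_loss(model, x_unlabeled, threshold=0.95):
    """Treat confident predictions as labels and apply cross-entropy to them."""
    with torch.no_grad():
        probs = F.softmax(model(x_unlabeled), dim=1)
        conf, pseudo_y = probs.max(dim=1)
        mask = conf.ge(threshold).float()  # keep only confident pseudo labels
    logits = model(x_unlabeled)
    return (F.cross_entropy(logits, pseudo_y, reduction="none") * mask).mean()
```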

On the other hand, when label information is completely missing, unsupervised learning (USL) methods can extract knowledge from the data without requiring any supervision signal. Prior work [5] has demonstrated the potential of combining USL with supervised learning objectives for SemiSL. In those combinations, however, USL is used only to pretrain a model, which is then either fine-tuned or frozen during supervised training; a true synergy between the USL objective and the supervision signal is therefore missing.

Approach

In this paper, we introduce Mutual-information-based Unsupervised & Semi-supervised Concurrent LEarning (MUSCLE), which shows that concurrently applying a USL objective alongside a SemiSL model achieves better performance. As shown in Figure 1, each image batch is composed of a set of labeled images and a set of unlabeled images. The batch is augmented separately by two different augmentation methods, one weak and one strong. A shared feature extractor and classifier produce a set of predictions for each of the two augmented batches. A USL objective [5] is then adopted to maximize the mutual information between the two sets of predictions.
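The mutual-information objective of [5] between the two sets of predictions can be sketched roughly as follows (PyTorch assumed; function and variable names are ours, not from the MUSCLE implementation).

```python
import torch
import torch.nn.functional as F

def mutual_information_loss(logits_weak, logits_strong, eps=1e-8):
    """Negative mutual information between two prediction distributions (IIC-style)."""
    p1 = F.softmax(logits_weak, dim=1)    # (batch, num_classes)
    p2 = F.softmax(logits_strong, dim=1)  # (batch, num_classes)
    # Joint distribution over class pairs, averaged over the batch and symmetrized.
    joint = p1.t() @ p2 / p1.size(0)
    joint = ((joint + joint.t()) / 2).clamp(min=eps)
    # Marginal distributions.
    pi = joint.sum(dim=1, keepdim=True)
    pj = joint.sum(dim=0, keepdim=True)
    # I(z, z') = sum_{c,c'} P(c,c') * log( P(c,c') / (P(c) P(c')) ); minimize the negative.
    mi = (joint * (joint.log() - pi.log() - pj.log())).sum()
    return -mi
```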

In parallel with the USL branch, as shown in Figure 1, the labeled portion of the image batch is additionally augmented, and a set of predictions is generated by the same feature extractor and classifier used in the USL branch. A supervised objective (e.g., cross-entropy loss) is then applied to minimize the disagreement between these predictions and the ground-truth labels. Furthermore, since MUSCLE is orthogonal to both consistency loss and pseudo labeling, approaches from either group can be naturally merged into its training scheme.
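Below is an illustrative sketch of how the two branches might be combined into one training step, reusing mutual_information_loss from the previous sketch; the split into feature_extractor and classifier, the augmentation callables, and the weight lambda_usl are illustrative assumptions rather than the exact MUSCLE configuration.

```python
import torch
import torch.nn.functional as F

def muscle_step(feature_extractor, classifier, x_labeled, y, x_all,
                weak_aug, strong_aug, labeled_aug, lambda_usl=1.0):
    # Unsupervised branch: maximize MI between predictions of two augmented views
    # of the full (labeled + unlabeled) batch.
    logits_weak = classifier(feature_extractor(weak_aug(x_all)))
    logits_strong = classifier(feature_extractor(strong_aug(x_all)))
    loss_usl = mutual_information_loss(logits_weak, logits_strong)

    # Supervised branch: cross-entropy on the separately augmented labeled subset,
    # using the same feature extractor and classifier.
    logits_sup = classifier(feature_extractor(labeled_aug(x_labeled)))
    loss_sup = F.cross_entropy(logits_sup, y)

    return loss_sup + lambda_usl * loss_usl
```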

Figure 1: The training structure of MUSCLE

Experimental Results

We evaluated the performance of MUSCLE on three benchmark datasets: CIFAR10, CIFAR100, and Mini-ImageNet. In addition to applying MUSCLE alone, we also conducted experiments combining MUSCLE with other state-of-the-art (SOTA) approaches: the Mean Teacher model [1] for consistency loss and the label propagation model [4] for pseudo labeling. The results are shown in Table 1, Table 2, and Table 3. Compared with the baselines, models trained with MUSCLE consistently achieve better performance on all three benchmarks and all setups. Furthermore, the advantage of MUSCLE becomes more pronounced as the number of labeled samples decreases.

We also evaluated the behavior of MUSCLE when combined with FixMatch [6]. Although we were unable to run FixMatch with its full training setup due to limited computational resources, we trained the FixMatch model with and without MUSCLE under two sets of hyper-parameters. As demonstrated in Table 4 and Table 5, MUSCLE has the potential to benefit the training of FixMatch.

[1]: Antti Tarvainen and Harri Valpola. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. NeurIPS, 2017.

[2]: Samuli Laine and Timo Aila. Temporal ensembling for semi-supervised learning. ICLR, 2017.

[3]: Dong-Hyun Lee. Pseudo-Label: The simple and efficient semi-supervised learning method for deep neural networks. ICML 2013 Workshop: Challenges in Representation Learning (WREPL), 2013.

[4]: Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, and Ondrej Chum. Label propagation for deep semi-supervised learning. CVPR, 2019.

[5]: Xu Ji, Joao F. Henriques, and Andrea Vedaldi. Invariant information clustering for unsupervised image classification and segmentation. ICCV, 2019.

[6]: Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, and Colin Raffel. FixMatch: Simplifying semi-supervised learning with consistency and confidence. NeurIPS, 2020.