Abstract
Many state-of-the-art noisy-label learning methods rely on learning
mechanisms that estimate the samples' clean labels during training and discard
their original noisy labels. However, this approach prevents learning the
relationship between images, noisy labels and clean labels, which has been
shown to be useful when dealing with instance-dependent label noise.
Furthermore, existing methods that do aim to learn this relationship require a
cleanly annotated subset of the data, as well as distillation or multi-faceted
models for
training. In this paper, we propose a new training algorithm that relies on a
simple model to learn the relationship between clean and noisy labels without
the need for a cleanly labelled subset of data. Our algorithm follows a
three-stage process: 1) self-supervised pre-training followed by an early-stopping
training of the classifier to confidently predict clean labels for a subset of
the training set; 2) use the clean set from stage (1) to bootstrap the
relationship between images, noisy labels and clean labels, which we exploit
for effective relabelling of the remaining training set using semi-supervised
learning; and 3) supervised training of the classifier with all relabelled
samples from stage (2). By learning this relationship, our method achieves
state-of-the-art performance on asymmetric and instance-dependent label noise
benchmarks.
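
To make the three-stage procedure concrete, the following is a minimal sketch in PyTorch, not the paper's implementation: the self-supervised pre-training is stubbed out, the confidence threshold TAU, the warm-up length, the toy data, and the simple MLP are all illustrative assumptions, and stage (2) approximates the semi-supervised relabelling by reusing the warm-up model's own confident predictions.

```python
# Hypothetical sketch of the three-stage training procedure.
import torch
import torch.nn.functional as F

TAU = 0.9           # assumed confidence threshold for selecting the clean subset
WARMUP_EPOCHS = 5   # early stopping: few epochs so the model does not memorise noise

def make_model(num_classes=10, dim=32):
    return torch.nn.Sequential(torch.nn.Linear(dim, 64), torch.nn.ReLU(),
                               torch.nn.Linear(64, num_classes))

def train(model, x, y, epochs, lr=1e-2):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        opt.step()

# Toy data: 256 samples, 32-dim features, noisy labels over 10 classes.
x = torch.randn(256, 32)
y_noisy = torch.randint(0, 10, (256,))

# Stage 1: self-supervised pre-training would initialise the encoder here;
# then a short, early-stopped supervised warm-up on the noisy labels.
model = make_model()
train(model, x, y_noisy, WARMUP_EPOCHS)

# Keep samples whose prediction is confident AND agrees with the noisy label.
with torch.no_grad():
    probs = F.softmax(model(x), dim=1)
conf, pred = probs.max(dim=1)
clean_mask = (conf > TAU) & (pred == y_noisy)

# Stage 2: bootstrap the image/noisy-label/clean-label relationship from the
# clean subset and relabel the remaining samples; here relabelling is crudely
# approximated by the warm-up model's predictions (the paper instead uses
# semi-supervised learning for this step).
y_relabelled = y_noisy.clone()
y_relabelled[~clean_mask] = pred[~clean_mask]

# Stage 3: fully supervised training on all relabelled samples.
final_model = make_model()
train(final_model, x, y_relabelled, epochs=20)
```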