Abstract
Noisy labels are challenging for deep learning because high-capacity deep models can easily overfit noisy-label training samples. Arguably the most realistic, and at the same time the most challenging, type of label noise is instance-dependent noise (IDN), where labelling errors are caused by ambiguous information present in the images. The most successful label noise
learning techniques for addressing IDN usually include a noisy-label
sample selection stage to separate clean from noisy-label samples during
training. Such sample selection depends on a criterion, such as loss or
gradient, and on a curriculum that defines the proportion of training samples to
be classified as clean at each training epoch.
Even though the noise rate estimated from the training set appears to be a
natural signal for defining this curriculum, previous approaches, to the best of
our knowledge, generally rely on arbitrary thresholds or pre-defined selection
functions. This paper addresses this research gap
by proposing a new noisy-label learning graphical model that can easily
accommodate state-of-the-art (SOTA) noisy-label learning methods and provide
them with a reliable noise rate estimate to be used in a new sample selection
curriculum. We show empirically that, when integrated with many SOTA methods, our model improves their results on many IDN benchmarks, including synthetic and real-world datasets.