LongReMix: Robust learning with high confidence samples in a noisy label environment

Filipe R. Cordeiro; Ragav Sachdeva; Vasileios Belagiannis; Ian Reid; Gustavo Carneiro

doi:10.1016/j.patcog.2022.109013

Journal article

Peer reviewed

Pattern recognition, Vol.133, p.109013

01/01/2023

DOI: https://doi.org/10.1016/j.patcog.2022.109013

Abstract

Deep learning

Empirical vicinal risk

Noisy label learning

Semi-supervised learning

• We propose a new two-stage noisy-label learning algorithm, called LongReMix.
• The first stage finds a highly precise, but potentially small, set of clean samples.
• The second stage is designed to be robust to small sets of clean samples.
• LongReMix reaches SOTA performance on the main noisy-label learning benchmarks.

State-of-the-art noisy-label learning algorithms rely on unsupervised learning to classify training samples as clean or noisy, followed by semi-supervised learning (SSL) that minimises the empirical vicinal risk using a labelled set formed by the samples classified as clean and an unlabelled set formed by the samples classified as noisy. The classification accuracy of such noisy-label learning methods depends on the precision of the unsupervised classification of clean and noisy samples, and on the robustness of SSL to small clean sets. We address both points with a new noisy-label training algorithm, called LongReMix, which uses a two-stage learning process to improve the precision of the unsupervised classification of clean and noisy samples and the robustness of SSL to small clean sets. Stage one of LongReMix finds a small but precise high-confidence clean set. Stage two augments this high-confidence clean set with new clean samples and oversamples the clean data to increase the robustness of SSL to small clean sets. We test LongReMix on CIFAR-10 and CIFAR-100 with synthetically introduced noisy labels, and on the real-world noisy-label benchmarks CNWL (Red Mini-ImageNet), WebVision, Clothing1M, and Food101-N. The results show that LongReMix produces significantly better classification accuracy than competing approaches, particularly in high noise rate problems. Furthermore, our approach achieves state-of-the-art performance on most datasets. The code is available at https://github.com/filipe-research/LongReMix.
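The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the linked repository). The abstract does not say how the high-confidence set is chosen, so the selection rule here (keep a sample only if it was flagged clean in at least a given fraction of training epochs) is an assumption, and the names `high_confidence_set`, `oversample`, `clean_flags`, and `min_fraction` are hypothetical.

```python
import random
from typing import List, Sequence


def high_confidence_set(clean_flags: Sequence[Sequence[bool]],
                        min_fraction: float = 0.9) -> List[int]:
    """Return indices of samples flagged clean in at least `min_fraction` of epochs.

    clean_flags[e][i] is True when sample i was classified as clean at epoch e.
    ASSUMPTION: this voting rule is illustrative, not taken from the paper.
    """
    n_epochs = len(clean_flags)
    n_samples = len(clean_flags[0])
    selected = []
    for i in range(n_samples):
        votes = sum(1 for epoch in clean_flags if epoch[i])
        if votes / n_epochs >= min_fraction:
            selected.append(i)
    return selected


def oversample(indices: Sequence[int], target_size: int, seed: int = 0) -> List[int]:
    """Pad a small clean set to `target_size` by drawing indices with replacement.

    The original indices are always kept; only the padding is random.
    """
    if not indices:
        raise ValueError("cannot oversample an empty clean set")
    rng = random.Random(seed)
    n_extra = max(target_size - len(indices), 0)
    extra = [rng.choice(indices) for _ in range(n_extra)]
    return list(indices) + extra


# Three epochs of clean/noisy decisions for three samples (made-up values).
flags = [
    [True, False, True],
    [True, True, False],
    [True, False, True],
]
strict = high_confidence_set(flags, min_fraction=0.9)   # only sample 0 qualifies
padded = oversample(strict, target_size=4)               # sample 0 repeated 4 times
```

A stricter `min_fraction` trades set size for precision, which is why a later step has to cope with a small clean set; oversampling is one simple way to stop the labelled set from being swamped by the unlabelled one.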

Details

Title: LongReMix: Robust learning with high confidence samples in a noisy label environment
Creators:
  Filipe R. Cordeiro - Universidade Federal Rural de Pernambuco
  Ragav Sachdeva - University of Oxford
  Vasileios Belagiannis - Otto-von-Guericke University Magdeburg
  Ian Reid - Australian Centre for Robotic Vision
  Gustavo Carneiro - University of Surrey
Publication Details: Pattern recognition, Vol.133, p.109013
Publisher: Elsevier Ltd
Publication Date: 01/01/2023
Identifiers: 99783169702346; WOS:000865435300008
Academic Unit: School of Computer Science and Electronic Engineering
Language: English
Resource Type: Journal article
