Abstract
Real-world large-scale medical image analysis (MIA) datasets present three
challenges: 1) they contain noisily labelled samples that hurt training
convergence and generalisation, 2) they usually have an imbalanced distribution
of samples per class, and 3) they typically pose a multi-label problem,
where samples can have multiple diagnoses. Current approaches are commonly
trained to solve a subset of these problems, but we are unaware of methods that
address all three simultaneously. In this paper, we propose a new
training module called Non-Volatile Unbiased Memory (NVUM), which
non-volatilely stores a running average of the model logits to form a new
regularisation loss for the noisy multi-label problem. We further unbias the
classification predictions in the NVUM update to handle the imbalanced
learning problem.
We run extensive experiments to evaluate NVUM on new benchmarks proposed in
this paper, where training is performed on noisy, multi-label, imbalanced
chest X-ray (CXR) training sets formed from Chest-Xray14 and CheXpert, and
testing is performed on the clean multi-label CXR datasets OpenI and PadChest.
Our method outperforms previous state-of-the-art CXR classifiers and
noisy-label learning methods on all evaluations. Our code is available at
https://github.com/FBLADL/NVUM.
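
As a rough illustration of the mechanism the abstract describes, the sketch
below assumes the memory keeps an exponential moving average (decay beta) of
each training sample's logits, debiased by the log of the class prior to
counter imbalance. The function and variable names are illustrative and are
not taken from the released code.

    import numpy as np

    # Hypothetical sketch: a non-volatile memory holding one logit vector per
    # training sample, updated as an exponential moving average and unbiased
    # by subtracting the log class prior, as the abstract describes.
    def update_memory(memory, sample_ids, logits, class_prior, beta=0.9):
        """EMA update of per-sample stored logits, debiased by the class prior."""
        debiased = logits - np.log(class_prior)   # unbias for class imbalance
        memory[sample_ids] = beta * memory[sample_ids] + (1.0 - beta) * debiased
        return memory

    # Usage: one memory slot per training sample, e.g. 14 Chest-Xray14 labels.
    num_samples, num_classes = 1000, 14
    memory = np.zeros((num_samples, num_classes))
    class_prior = np.full(num_classes, 1.0 / num_classes)  # assumed uniform here

    batch_ids = np.array([3, 17, 42])                # indices of a mini-batch
    batch_logits = np.random.randn(3, num_classes)   # model outputs (synthetic)
    memory = update_memory(memory, batch_ids, batch_logits, class_prior)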