Abstract
Solution for addressing real-world image classification challenges. Human-AI collaborative classification (HAI-CC) aims to synergise the efficiency of machine learning classifiers and the reliability of human experts to support decision making. Learning to defer (L2D) has been one of the promising HAI-CC approaches, where the system assesses a sample and decides to defer to one of human experts when it is not confident. Despite recent progress, existing L2D methods rely on the strong assumption of ground truth label availability for training, while in practice, most datasets often contain multiple noisy annotations per data sample without well-curated ground truth labels. In addition, current L2D methods either consider the setting of a single human expert or defer the decision to one human expert, even though there may be multiple experts available, resulting in a suboptimal utilisation of available resources. Furthermore, current HAI-CC evaluation frameworks often overlook processing costs, making it difficult to assess the trade-off between computational efficiency and performance when bench-marking different methods. To address these gaps, this paper introduces LECOMH-a new HAI-CC method that learns from noisy labels without depending on clean labels for training, simultaneously maximising collaborative accuracy with either one or multiple human experts, while minimising the cost of human collaboration. The paper also introduces benchmarks featuring multiple noisy labels per data sample for both training and testing to evaluate HAI-CC methods. Through quantitative comparisons on these benchmarks, LECOMH consistently outperforms HAI-CC methods and baselines, including human experts alone, multi-rater learning and noisy-label learning methods across both synthetic and real-world datasets.