Abstract
Recent advances in prototypical learning have shown remarkable potential to
provide useful decision interpretations by associating activation maps and
predictions with class-specific training prototypes. Such prototypical learning
has been well studied for various single-label diseases. However, for the highly
relevant and more challenging task of multi-label diagnosis, where multiple
diseases often co-occur within an image, existing prototypical learning models
struggle to obtain meaningful activation maps and effective class prototypes
because the diseases are entangled. In this paper, we present a novel Cross-
and Intra-image Prototypical Learning (CIPL) framework for accurate
multi-label disease diagnosis and interpretation from medical images. CIPL
takes advantage of common cross-image semantics to disentangle the multiple
diseases when learning the prototypes, allowing a comprehensive understanding
of complicated pathological lesions. Furthermore, we propose a new two-level
alignment-based regularisation strategy that effectively leverages consistent
intra-image information to enhance interpretation robustness and predictive
performance. Extensive experiments show that CIPL attains
state-of-the-art (SOTA) classification accuracy on two public multi-label
benchmarks of disease diagnosis: thoracic radiography and fundus images.
Quantitative interpretability results show that CIPL also outperforms other
leading saliency- and prototype-based explanation methods in weakly-supervised
thoracic disease localisation.