Abstract
Constructing effective representations of lesions is essential for disease classification and localisation in medical image analysis. Prototype-based models address this by leveraging visual prototypes to capture representative lesion patterns, yet handling the complexity of diverse lesion characteristics remains a critical challenge, as these models typically rely on single-level, fixed-size prototypes and suffer from prototype redundancy. In this paper, we present HierProtoPNet, a new prototype-based framework designed to handle the complexity of lesions in medical images. HierProtoPNet leverages hierarchical visual prototypes across different semantic feature granularities to effectively capture diverse lesion patterns. To prevent redundancy and increase the utility of the prototypes, we devise a novel prototype mining paradigm that progressively discovers semantically distinct prototypes, offering multi-level complementary analysis of lesions. We also introduce a dynamic knowledge distillation strategy that transfers essential classification information across hierarchical levels, thereby improving generalisation performance. Comprehensive experiments show that HierProtoPNet achieves state-of-the-art classification performance on three benchmarks: binary breast cancer screening, multi-class retinal disease diagnosis, and multi-label chest X-ray classification. Quantitative assessments also demonstrate HierProtoPNet's significant advantages in weakly-supervised disease localisation and segmentation.