Abstract
Methods to detect malignant lesions from screening mammograms are usually
trained with fully annotated datasets, where images are labelled with the
localisation and classification of cancerous lesions. However, real-world
screening mammogram datasets commonly have a subset that is fully annotated and
another subset that is weakly annotated with just the global classification
(i.e., without lesion localisation). Given the large size of such datasets,
researchers usually face a dilemma with the weakly annotated subset: discard it
or fully annotate it. The first option reduces detection accuracy because part
of the dataset goes unused, and the second is too expensive because the
annotation must be done by expert radiologists. In
this paper, we propose a middle-ground solution to this dilemma, which is to
formulate the training as a weakly- and semi-supervised learning problem that
we refer to as malignant breast lesion detection with incomplete annotations.
To address this problem, we propose a new method comprising two stages: 1)
pre-training a multi-view mammogram classifier with weak supervision from the
whole dataset, and 2) extending the trained classifier to become a multi-view
detector that is trained with semi-supervised student-teacher learning, where
the training set contains both fully and weakly annotated mammograms. We provide
extensive detection results on two real-world screening mammogram datasets
containing incomplete annotations, and show that our proposed approach achieves
state-of-the-art results in the detection of malignant breast lesions with
incomplete annotations.
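
The data setup described above can be illustrated with a minimal sketch. It only shows which supervision signal each stage can draw on; it is not the paper's implementation. The names `Mammogram`, `classification_targets`, and `split_by_annotation`, as well as the box format, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max); this coordinate convention is an assumption.
Box = Tuple[float, float, float, float]


@dataclass
class Mammogram:
    image_id: str
    is_malignant: bool  # global classification label, available for every image
    # None means the lesion localisation was never annotated (weakly annotated).
    # An empty list means the image was fully annotated and contains no lesion.
    lesion_boxes: Optional[List[Box]] = None


def classification_targets(dataset: List[Mammogram]) -> List[Tuple[str, bool]]:
    """Stage 1: weak supervision uses the global label of the whole dataset."""
    return [(m.image_id, m.is_malignant) for m in dataset]


def split_by_annotation(
    dataset: List[Mammogram],
) -> Tuple[List[Mammogram], List[Mammogram]]:
    """Stage 2: separate the fully annotated subset from the weakly annotated one."""
    fully = [m for m in dataset if m.lesion_boxes is not None]
    weakly = [m for m in dataset if m.lesion_boxes is None]
    return fully, weakly
```

In this sketch, stage 1 would consume every image through `classification_targets`, while stage 2 would train the detector on both subsets returned by `split_by_annotation`, with localisation supervision available only for the fully annotated one.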