Abstract
Unsupervised anomaly detection (UAD) aims to find anomalous images by
optimising a detector using a training set that contains only normal images.
UAD approaches can be based on reconstruction methods, self-supervised
approaches, or ImageNet pre-trained models. Reconstruction methods, which
detect anomalies from image reconstruction errors, are advantageous because
they rely neither on the design of problem-specific pretext tasks needed by
self-supervised approaches, nor on the unreliable transfer of models
pre-trained on non-medical datasets. However, reconstruction methods may fail
because they can have low reconstruction errors even for anomalous images. In
this paper, we introduce a new reconstruction-based UAD approach that addresses
this low-reconstruction error issue for anomalous images. Our UAD approach, the
memory-augmented multi-level cross-attentional masked autoencoder (MemMC-MAE),
is a transformer-based approach, consisting of a novel memory-augmented
self-attention operator for the encoder and a new multi-level cross-attention
operator for the decoder. MemMC-MAE masks large parts of the input image during
its reconstruction, reducing the risk of producing low reconstruction errors
for anomalous images, since anomalies are likely to be masked and thus cannot
be accurately reconstructed.
However, when the anomaly is not masked, the normal patterns stored in the
encoder's memory, combined with the decoder's multi-level cross-attention,
will prevent the anomaly from being accurately reconstructed. We show that our method
achieves SOTA anomaly detection and localisation on colonoscopy, pneumonia, and
Covid-19 chest X-ray datasets.
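The memory-augmented self-attention described above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the function names, dimensions, single-head attention without learned projections, and the residual combination of token and memory outputs are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, d):
    # plain single-head self-attention over patch tokens
    # (learned Q/K/V projections omitted for brevity)
    scores = tokens @ tokens.T / np.sqrt(d)
    return softmax(scores) @ tokens

def memory_augmented_attention(tokens, memory, d):
    # 1) ordinary self-attention over the visible (unmasked) patch tokens
    attended = self_attention(tokens, d)
    # 2) cross-attend each token to a learned memory bank that stores
    #    prototypical normal patterns; anomalous inputs are pulled towards
    #    normal prototypes, raising their reconstruction error
    mem_scores = attended @ memory.T / np.sqrt(d)
    mem_out = softmax(mem_scores) @ memory
    # residual combination of token and memory outputs (an assumption)
    return attended + mem_out

rng = np.random.default_rng(0)
d = 16
tokens = rng.standard_normal((8, d))   # 8 visible patch tokens
memory = rng.standard_normal((32, d))  # 32 learned memory slots
out = memory_augmented_attention(tokens, memory, d)
print(out.shape)  # (8, 16)
```

In a trained model, `memory` would be a learned parameter optimised on normal images only, so at test time the memory read-out biases the reconstruction towards normal appearance.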