Abstract
In recent years, robotic systems have become increasingly popular for inspecting large infrastructure. Traditionally, the footage collected by such systems must be monitored and analysed manually by professionals to detect abnormalities, a process that can be very time-consuming and expensive. Research in automatic visual anomaly detection can therefore have great practical significance: it can reduce the cost and difficulty of inspection and allow infrastructure to be inspected more continuously.
Deep learning-based visual anomaly detection has achieved state-of-the-art performance on various image and video anomaly detection tasks in research settings. However, applying anomaly detection models to real-world scenarios remains challenging. Real-world data is often noisy, unstructured and diverse, which can cause high false-positive rates and poor generalisation to new environments. Anomalies can also be subtle or context-dependent, which current deep learning models struggle to capture. Lastly, the black-box nature of these models, and the resulting lack of interpretability and transparency, can pose a challenge for regulators. To overcome these challenges, this thesis introduces a novel set of anomaly detection models based on modular neural networks and graph neural networks. The results indicate that these models are a promising avenue for overcoming the stated challenges while maintaining high accuracy.