Highlights

What are the main findings?
A pre-trained transformer model, fine-tuned with transfer learning, significantly improves fault detection in cyber-physical systems (CPSs) despite limited fault-labeled data.
The proposed method achieves a high average F1-score of 93.38% on industrial CPS datasets, outperforming traditional CNN and LSTM models.

What is the implication of the main finding?
Transformer-based transfer learning enables more reliable fault diagnostics in industrial CPS environments where data scarcity and domain shifts are common.
The approach demonstrates practical scalability from controlled lab conditions to real-world industrial applications.

Abstract

As industries become increasingly dependent on cyber-physical systems (CPSs), failures within these systems can cause significant operational disruptions, underscoring the critical need for effective Prognostics and Health Management (PHM). The large volume of data generated by CPSs has made deep learning (DL) methods an attractive solution; however, imbalanced datasets and the limited availability of fault-labeled data continue to hinder their effective deployment in real-world applications.
To address these challenges, this paper proposes a transfer learning approach using a pre-trained transformer architecture to enhance fault detection performance in CPSs. A streamlined transformer model is first pre-trained on a large-scale source dataset and then fine-tuned end-to-end on a smaller dataset with a differing data distribution. This approach enables the transfer of diagnostic knowledge from controlled laboratory environments to real-world operational settings, effectively addressing the domain shift challenge commonly encountered in industrial CPSs. To evaluate the effectiveness of the proposed method, extensive experiments are conducted on publicly available datasets generated from a laboratory-scale replica of a modern industrial water purification facility. The results show that the model achieves an average F1-score of 93.38% under K-fold cross-validation, outperforming baseline CNN and LSTM architectures and demonstrating the practicality of transformer-based transfer learning in industrial settings with limited fault data. Finally, to enhance transparency and better understand the model's decision process, SHAP (SHapley Additive exPlanations) is applied for explainable AI (XAI).
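The pre-train-then-fine-tune workflow summarized above can be sketched in a few lines. The sketch below is illustrative only: it substitutes a tiny logistic classifier for the paper's transformer, and the toy "source" and "target" datasets (with an artificial mean shift standing in for the lab-to-plant domain shift) are invented for the example; none of the names correspond to the paper's actual datasets or model.

```python
# Illustrative transfer-learning workflow: pre-train on a large source
# dataset, then fine-tune the same weights on a small, shifted target set.
# A logistic classifier stands in for the paper's transformer model.
import math
import random

def train(weights, data, epochs, lr):
    """In-place gradient-descent training of a logistic classifier."""
    for _ in range(epochs):
        for x, y in data:
            z = sum(w * xi for w, xi in zip(weights, x))
            z = max(-30.0, min(30.0, z))        # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))      # sigmoid
            g = p - y                           # log-loss gradient w.r.t. z
            for i, xi in enumerate(x):
                weights[i] -= lr * g * xi
    return weights

def make_data(n, shift, seed):
    """Toy two-class data; `shift` mimics a source-to-target domain shift."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [1.0,                               # bias feature
             rng.gauss(y + shift, 0.3),
             rng.gauss(y + shift, 0.3)]
        out.append((x, y))
    return out

# 1) Pre-train on a large labelled source dataset (lab-like conditions).
source = make_data(400, shift=0.0, seed=0)
w = train([0.0, 0.0, 0.0], source, epochs=200, lr=0.5)

# 2) Fine-tune end-to-end on a small target dataset whose distribution
#    differs from the source (the domain-shift scenario in the abstract).
target = make_data(40, shift=0.3, seed=1)
w = train(w, target, epochs=50, lr=0.1)

# 3) Evaluate on held-out target-domain data.
test = make_data(100, shift=0.3, seed=2)
accuracy = sum(
    (sum(wi * xi for wi, xi in zip(w, x)) > 0) == (y == 1)
    for x, y in test
) / len(test)
```

In the real method the transferred object is the full set of pre-trained transformer weights rather than three scalars, and fine-tuning is end-to-end; the structure of the pipeline (pre-train, fine-tune on scarce shifted data, evaluate on the target domain) is the same.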