Abstract
The traditional supervised learning paradigm relies on large volumes of annotated data, which is often costly and labor-intensive to obtain, creating a major bottleneck in developing deep learning solutions. To overcome this limitation, we propose a novel self-learning model for failure classification in multivariate time-series data using a semi-supervised approach that combines unsupervised and supervised learning. Initially, an unsupervised method identifies normal and faulty patterns to pseudo-label a small dataset. A deep supervised learning model is then trained with these pseudo-labels, incorporating a confidence layer to assign prediction confidence scores. This enables iterative refinement and progressive construction of a labeled dataset from unlabeled data. Furthermore, transfer learning is employed to support multiclass fault classification, allowing the model to generalize across evolving fault types. Our contribution lies in the unique orchestration of unsupervised preprocessing, confidence-guided supervision, and transfer learning to adaptively retain prior knowledge while minimizing human annotation. This makes the proposed framework particularly well-suited for dynamic environments where labeled failure data is scarce and incrementally available.
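The confidence-guided self-labeling loop described above can be sketched in simplified form. This is an illustrative mock-up, not the authors' implementation: a nearest-centroid classifier stands in for the deep supervised model, and a softmax over negative distances plays the role of the confidence layer; the function name, threshold, and round count are all assumptions for illustration.

```python
import numpy as np

def self_train(X_seed, y_seed, X_unlabeled, threshold=0.8, max_rounds=5):
    """Iteratively grow a labeled set from unlabeled data (simplified sketch).

    A nearest-centroid classifier stands in for the deep model; softmax over
    negative distances to the centroids acts as the 'confidence layer'. In each
    round, unlabeled samples whose top-class confidence exceeds `threshold` are
    pseudo-labeled and promoted into the training set.
    """
    X_lab, y_lab = X_seed.copy(), y_seed.copy()
    pool = X_unlabeled.copy()
    for _ in range(max_rounds):
        if len(pool) == 0:
            break
        # "Train": recompute class centroids from the current labeled set
        classes = np.unique(y_lab)
        centroids = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
        # Predict with confidence: softmax over negative Euclidean distances
        dists = np.linalg.norm(pool[:, None, :] - centroids[None, :, :], axis=2)
        logits = -dists
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        conf = probs.max(axis=1)
        pred = classes[probs.argmax(axis=1)]
        # Promote only confident pseudo-labels into the labeled set
        accept = conf >= threshold
        if not accept.any():
            break
        X_lab = np.vstack([X_lab, pool[accept]])
        y_lab = np.concatenate([y_lab, pred[accept]])
        pool = pool[~accept]
    return X_lab, y_lab, pool
```

In the actual framework, the centroid step would be replaced by training the deep network on the current pseudo-labeled set, but the control flow, confident predictions feeding the next round's training data, is the same.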
• A novel self-learning model to classify unlabeled multivariate time-series.
• Combines unsupervised learning, supervised learning, and transfer learning.
• Reduces manual labeling needs in industrial fault diagnosis scenarios.
• Model validated on real-world multivariate time-series data with scarce labels.
• Achieves performance comparable to fully supervised DL models without labeled data.