Abstract
The Internet of Things (IoT) is proliferating rapidly worldwide. The 2016 cyberattack on Dyn exposed notable vulnerabilities in intelligent networks, and IoT security has emerged as a significant concern. Internet-connected devices can be exploited, and their susceptibility to botnets threatens the entire ecosystem of Internet-connected smart devices. Ensemble Learning (EL) is effective for detecting network attacks. However, network traffic datasets are typically large, and so is the memory needed to process them. Deploying EL on IoT and Industrial Internet of Things (IIoT) devices with limited memory is therefore nearly impossible. To address this problem, this thesis makes three main contributions.
First, a SMOTE-Stack EL classifier for network intrusion detection in an IoT network is proposed and evaluated on the Bot-IoT dataset. In preliminary results, the classifier shows lower metric scores for minority network categories after the Synthetic Minority Oversampling Technique (SMOTE) is applied to address the class imbalance. In follow-up experiments, SMOTE-Stack outperforms Stack and other state-of-the-art classifiers.
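The oversampling idea behind SMOTE can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `smote`, its parameters, and the toy minority samples are hypothetical, and only the core interpolation step of the technique is shown. Each synthetic sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority-class samples (made-up values, not taken from Bot-IoT).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by the minority class rather than duplicating existing records.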
Second, a novel ensemble feature dimensionality reduction technique (FI-PCA) is proposed. Feature Importance (FI) and Principal Component Analysis (PCA) are used to preprocess the network dataset: FI identifies the most important features in the data, while PCA reduces dimensionality and denoises the data. Three single classifiers are employed to detect anomalies: Decision Tree (DT), Naive Bayes (NB), and Logistic Regression (LR). Preliminary results, however, show that these classifiers achieve only average classification metric scores. The Stack Ensemble Learning (SEL) method, which combines the single classifiers, is therefore used to further improve classification performance. Experimental results on varied feature dimensions of an IoT dataset (Bot-IoT) indicate that the proposed technique combined with SEL maintains the same level of classification performance with reduced dataset features, while both training and test time decrease markedly.
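A pipeline of this kind can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the thesis code: the data are synthetic rather than Bot-IoT, the FI ranker (an extra-trees ensemble), the order of the FI and PCA steps, the retained dimensions (10 features, 5 components), the Gaussian variant of NB, and the logistic-regression meta-learner are all choices made here for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for network traffic features (not the Bot-IoT dataset).
X, y = make_classification(
    n_samples=600, n_features=30, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# FI step: keep the 10 features ranked highest by a tree ensemble (assumed ranker).
fi_select = SelectFromModel(
    ExtraTreesClassifier(n_estimators=100, random_state=0),
    max_features=10,
    threshold=-np.inf,  # select purely by rank, exactly max_features columns
)

# SEL step: stack the three base classifiers named in the text.
# The meta-learner (final_estimator) is an assumption of this sketch.
stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("nb", GaussianNB()),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Scale, select by importance, project with PCA, then classify with the stack.
model = make_pipeline(StandardScaler(), fi_select, PCA(n_components=5), stack)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
reduced_dim = model[:-1].transform(X_test).shape[1]  # features seen by the stack
```

Wrapping the preprocessing and the stacked classifier in a single pipeline means the FI ranking and PCA projection are fitted on the training split only, so the test split is transformed with parameters it did not influence.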
Third, SEL is combined with FI-based dimensionality reduction to reduce the feature dimensions of an IoT/IIoT network traffic dataset. This work extends the second contribution and is motivated by the strong performance of SEL combined with the feature dimensionality reduction technique (FI-PCA). The efficacy of the proposed lightweight EL method is evaluated through extensive experiments on the Edge-IIoTset dataset, which contains IoT and IIoT network traces. FI reduces the storage space needed for the extensive network traffic data by 93.4%, which surpasses current state-of-the-art feature dimensionality reduction methods, and it significantly decreases the time required for training and testing. Despite the considerable decrease in feature dimensions, the SEL model maintains excellent classification performance, similar to that obtained when all features of the dataset are used for classifier assessment.