skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: A Comprehensive Analysis of Machine Learning based Intrusion Detection System for IoT-23 Dataset
With the proliferation of IoT devices, securing the IoT-based network is of paramount importance. IoT-based networks consist of diversely purposed IoT devices. This diversity of IoT devices necessitates diverse dataset analysis to ensure effective implementation of Machine Learning (ML)-based cybersecurity. However, much-demanded real-world IoT data is still in short supply for active ML-based IoT data analysis. This paper presents an in-depth analysis of the real-time IoT-23 dataset [9]. Exhaustive analysis of all 20 scenarios of the IoT-23 dataset reveals the consistency between feature selection methods and detection algorithms. The proposed ML-based intrusion detection system (ML-IDS) achieves significant improvement in detection accuracy, which in some scenarios reaches 100%. Our analysis also demonstrates that the required number of features for a high detection rate of greater than 99% remains small, i.e., 2 or 3, enabling ML-IDS implementation even with resource-constrained IoT processors.  more » « less
Award ID(s):
2029295
PAR ID:
10349757
Author(s) / Creator(s):
Date Published:
Journal Name:
INCoS-2022
Volume:
527
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Volunteer computing uses Internet-connected devices (laptops, PCs, smart devices, etc.), in which their owners volunteer them as storage and computing power resources, has become an essential mechanism for resource management in numerous applications. The growth of the volume and variety of data traffic on the Internet leads to concerns on the robustness of cyberphysical systems especially for critical infrastructures. Therefore, the implementation of an efficient Intrusion Detection System for gathering such sensory data has gained vital importance. In this article, we present a comparative study of Artificial Intelligence (AI)-driven intrusion detection systems for wirelessly connected sensors that track crucial applications. Specifically, we present an in-depth analysis of the use of machine learning, deep learning and reinforcement learning solutions to recognise intrusive behavior in the collected traffic. We evaluate the proposed mechanisms by using KDD’99 as real attack dataset in our simulations. Results present the performance metrics for three different IDSs, namely the Adaptively Supervised and Clustered Hybrid IDS (ASCH-IDS), Restricted Boltzmann Machine-based Clustered IDS (RBC-IDS), and Q-learning based IDS (Q-IDS), to detect malicious behaviors. We also present the performance of different reinforcement learning techniques such as State-Action-Reward-State-Action Learning (SARSA) and the Temporal Difference learning (TD). Through simulations, we show that Q-IDS performs with detection rate while SARSA-IDS and TD-IDS perform at the order of . 
    more » « less
  2. Internet of Things (IoT) devices have increased drastically in complexity and prevalence within the last decade. Alongside the proliferation of IoT devices and applications, attacks targeting them have gained popularity. Recent large-scale attacks such as Mirai and VPNFilter highlight the lack of comprehensive defenses for IoT devices. Existing security solutions are inadequate against skilled adversaries with sophisticated and stealthy attacks against IoT devices. Powerful provenance-based intrusion detection systems have been successfully deployed in resource-rich servers and desktops to identify advanced stealthy attacks. However, IoT devices lack the memory, storage, and computing resources to directly apply these provenance analysis techniques on the device. This paper presents ProvIoT, a novel federated edge-cloud security framework that enables on-device syscall-level behavioral anomaly detection in IoT devices. ProvIoT applies federated learning techniques to overcome data and privacy limitations while minimizing network overhead. Infrequent on-device training of the local model requires less than 10% CPU overhead; syncing with the global models requires sending and receiving 2MB over the network. During normal offline operation, ProvIoT periodically incurs less than 10% CPU overhead and less than 65MB memory usage for data summarization and anomaly detection. Our evaluation shows that ProvIoT detects fileless malware and stealthy APT attacks with an average F1 score of 0.97 in heterogeneous real-world IoT applications. ProvIoT is a step towards extending provenance analysis to resource-constrained IoT devices, beginning with well-resourced IoT devices such as the RaspberryPi, Jetson Nano, and Google TPU. 
    more » « less
  3. null (Ed.)
    Network intrusion detection systems (IDS) has efficiently identified the profiles of normal network activities, extracted intrusion patterns, and constructed generalized models to evaluate (un)known attacks using a wide range of machine learning approaches. In spite of the effectiveness of machine learning-based IDS, it has been still challenging to reduce high false alarms due to data misclassification. In this paper, by using multiple decision mechanisms, we propose a new classification method to identify misclassified data and then to classify them into three different classes, called a malicious, benign, and ambiguous dataset. In other words, the ambiguous dataset contains a majority of the misclassified dataset and is thus the most informative for improving the model and anomaly detection because of the lack of confidence for the data classification in the model. We evaluate our approach with the recent real-world network traffic data, Kyoto2006+ datasets, and show that the ambiguous dataset contains 77.2% of the previously misclassified data. Re-evaluating the ambiguous dataset effectively reduces the false prediction rate with minimal overhead and improves accuracy by 15%. 
    more » « less
  4. The advances in deep neural networks (DNN) have significantly enhanced real-time detection of anomalous data in IoT applications. However, the complexity-accuracy-delay dilemma persists: Complex DNN models offer higher accuracy, but typical IoT devices can barely afford the computation load, and the remedy of offloading the load to the cloud incurs long delay. In this article, we address this challenge by proposing an adaptive anomaly detection scheme with hierarchical edge computing (HEC). Specifically, we first construct multiple anomaly detection DNN models with increasing complexity and associate each of them to a corresponding HEC layer. Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network . We also incorporate a parallelism policy training method to accelerate the training process by taking advantage of distributed models. We build an HEC testbed using real IoT devices and implement and evaluate our contextual-bandit approach with both univariate and multivariate IoT datasets. In comparison with both baseline and state-of-the-art schemes, our adaptive approach strikes the best accuracy-delay tradeoff on the univariate dataset and achieves the best accuracy and F1-score on the multivariate dataset with only negligibly longer delay than the best (but inflexible) scheme. 
    more » « less
  5. The increasing prevalence of Internet of Things (IoT) devices has introduced significant challenges in digital forensic investigations, requiring new strategies for effective evidence prioritization and analysis. Traditional forensic methods struggle with data heterogeneity, volatility, and legal constraints, making IoT evidence collection complex and time-sensitive. This paper presents a weighted prioritization model (WPM) that ranks IoT devices based on six forensic criteria, enabling investigators to focus on highpriority evidence first, reducing data loss and optimizing forensic workflows. Through case studies in arson, homicide, and missing person investigations, we demonstrate how WPM enhances investigative decisionmaking and resource allocation in real-world forensic scenarios. The proposed framework offers a structured, scalable, and adaptable approach to IoT forensic investigations, improving efficiency, reliability, and legal compliance in digital evidence collection. 
    more » « less